[openstack-dev] [Fuel] Stop deployment can break production cluster. How we should avoid it?

Vladimir Sharshov vsharshov at mirantis.com
Fri Jan 22 18:15:15 UTC 2016


Hi!

I also vote for the solution "mark a cluster 'operational' after successful
deployment". It is simple and guarantees that we do not erase the main
components.
It will also free up resources to support the stop/rerun (resume) feature on
task-based deployment, which will work much better (without node destruction
as a side effect).
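
To illustrate the idea, here is a minimal sketch in Python. All names in it
(Cluster, ClusterStatus, was_operational, can_stop_deployment) are
hypothetical and are not taken from the Fuel code base. It assumes the
'operational' mark is sticky, i.e. it survives later deployment runs such as
adding a node.

```python
from dataclasses import dataclass
from enum import Enum


class ClusterStatus(Enum):
    # Hypothetical status values, for illustration only.
    NEW = "new"
    DEPLOYMENT = "deployment"
    OPERATIONAL = "operational"


@dataclass
class Cluster:
    name: str
    status: ClusterStatus = ClusterStatus.NEW
    # Assumption: set after the first successful deployment, never cleared.
    was_operational: bool = False

    def start_deployment(self) -> None:
        self.status = ClusterStatus.DEPLOYMENT

    def finish_deployment(self) -> None:
        # A successful deployment marks the cluster 'operational'.
        self.status = ClusterStatus.OPERATIONAL
        self.was_operational = True

    def can_stop_deployment(self) -> bool:
        # 'Stop' is allowed only while deploying a cluster that has
        # never been operational, so live controllers are never erased.
        return (
            self.status is ClusterStatus.DEPLOYMENT
            and not self.was_operational
        )
```

With this guard, 'stop' stays available during the first deployment and is
disabled on any later run, for example when a node is added to a working
cluster.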

On Fri, Jan 22, 2016 at 8:09 PM, Igor Kalnitsky <ikalnitsky at mirantis.com>
wrote:

> Dmitry,
>
> > We can mark a cluster 'operational' after successful deployment. And we
> > can disable the 'stop' button for clusters of this kind.
>
> I think this is the best solution so far. Moreover, I don't know how to
> fix it properly, since there are a lot of open questions about how this
> button should behave at all.
>
> Taking all this into account, I propose that we either solve this issue
> as a blueprint (so we can think through and cover all edge cases in the
> spec) or drop the stop button functionality altogether.
>
> The latter, perhaps, may be a good solution. I don't know how often
> anyone uses Stop deployment.
>
>
> Bogdan,
>
> > This is a critical issue. The *worst* of possible situations for
> > cluster operations. I believe this should be covered by a dedicated
> > bulletin, the stop action should be disabled for all releases as an
> > emergency fix, and the issue fixed in the next maintenance updates.
>
> It wasn't always the case. Some time ago we didn't execute any tasks
> on controllers when adding new nodes. It has been the case, I assume,
> since Fuel 8.0, when we started executing netconfig and other puppet
> tasks on each deployment run.
>
> So we need to investigate in which release we introduced
> re-execution of some tasks on controllers, and only then think about
> bulletins.
>
>
> Thanks,
> Igor
>
> On Fri, Jan 22, 2016 at 1:06 PM, Bogdan Dobrelya <bdobrelia at mirantis.com>
> wrote:
> > On 22.01.2016 11:45, Dmitry Pyzhov wrote:
> >> Guys,
> >>
> >> There is a tricky bug with our 'stop deployment'
> >> feature: https://bugs.launchpad.net/fuel/+bug/1529691
> >>
> >> It cannot be fixed easily because it is a design flaw. By design we
> >> cannot leave a node in an unpredictable state, so we move all nodes
> >> that are not in the ready state back to bootstrap.
> >>
> >> But when a user adds a node and deploys the cluster, the system reruns
> >> puppet on the controllers. If the user presses the 'stop' button, the
> >> controllers will be erased and the cluster will be destroyed. This is
> >> definitely not the expected behaviour.
> >
> > This is a critical issue. The *worst* of possible situations for
> > cluster operations. I believe this should be covered by a dedicated
> > bulletin, the stop action should be disabled for all releases as an
> > emergency fix, and the issue fixed in the next maintenance updates.
> >
> >>
> >> Taking into account that we are going to rewrite this feature in 9.0
> >> and that we are close to HCF, there is no value in major changes to
> >> this feature in 8.0. Let's do a simple workaround.
> >>
> >> We can mark a cluster 'operational' after successful deployment. And we
> >> can disable the 'stop' button for clusters of this kind.
> >>
> >> Any concerns or other proposals?
> >>
> >>
> >>
> >> __________________________________________________________________________
> >> OpenStack Development Mailing List (not for usage questions)
> >> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >>
> >
> >
> > --
> > Best regards,
> > Bogdan Dobrelya,
> > Irc #bogdando