[openstack-dev] [Fuel] Stop deployment can break production cluster. How we should avoid it?

Kyrylo Galanov kgalanov at mirantis.com
Sat Jan 23 14:08:16 UTC 2016


Hello,

Why don't we introduce an additional state for nodes, such as 're-deploying'?
If deployment is stopped, we don't erase nodes in this state, but change
their status to 'error' or 'ready', for example.
Or we can add a warning message that the 'stop' button will destroy each and
every node.
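
A minimal sketch of the first idea, to make the intended behaviour concrete.
All names here (NodeStatus, Node, on_stop_deployment) are hypothetical and
are not taken from the Nailgun code; the choice of 'error' rather than
'ready' after a stop is also just an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class NodeStatus(Enum):
    # Hypothetical status names for illustration, not Nailgun's real enum.
    DISCOVER = "discover"         # bootstrap, nothing provisioned yet
    DEPLOYING = "deploying"       # first deployment of a new node
    REDEPLOYING = "re-deploying"  # proposed: deployed node being re-run
    READY = "ready"
    ERROR = "error"


@dataclass
class Node:
    name: str
    status: NodeStatus


def on_stop_deployment(nodes):
    """Update node statuses in place; return names of nodes to erase."""
    to_erase = []
    for node in nodes:
        if node.status is NodeStatus.REDEPLOYING:
            # Node was deployed before this run: keep it, flag it for review.
            node.status = NodeStatus.ERROR
        elif node.status is NodeStatus.DEPLOYING:
            # Node was never fully deployed: safe to send back to bootstrap.
            node.status = NodeStatus.DISCOVER
            to_erase.append(node.name)
        # Nodes in any other status (e.g. READY) are left untouched.
    return to_erase
```

With this split, stopping a run that adds one compute node to a working
cluster would erase only the new node, while controllers that were being
re-run end up in 'error' and keep their data.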

On Fri, Jan 22, 2016 at 8:15 PM, Vladimir Sharshov <vsharshov at mirantis.com>
wrote:

> Hi!
>
> I also vote for the solution "mark a cluster 'operational' after successful
> deployment". It is simple and guarantees that we do not erase main
> components.
> It will also free up resources to support the stop/rerun (resume) feature in
> task-based deployment, which will work much better (without node destruction
> as a side effect).
>
> On Fri, Jan 22, 2016 at 8:09 PM, Igor Kalnitsky <ikalnitsky at mirantis.com>
> wrote:
>
>> Dmitry,
>>
>> > We can mark a cluster 'operational' after successful deployment. And we
>> > can disable 'stop' button on this kind of clusters.
>>
>> I think this is the best solution so far. Moreover, I don't know how to
>> fix it properly, since there are a lot of open questions about how this
>> button should behave at all.
>>
>> Taking all this into account, I propose to solve this issue as a
>> blueprint (so we can think through and cover all edge cases in the spec)
>> or drop the stop button functionality altogether.
>>
>> The latter, perhaps, may be a good solution. I don't know how often
>> anyone uses Stop deployment.
>>
>>
>> Bogdan,
>>
>> > This is the critical issue. The *worst* of possible situations for
>> > cluster operations. I believe this should be covered by a dedicated
>> > bulletin issued, the stop action shall be disabled for all releases as
>> > emergency fix, and fixed by next maintenance updates.
>>
>> It wasn't always the case. Some time ago we didn't execute any tasks
>> on controllers when adding new nodes. It has become the case, I assume,
>> since Fuel 8.0, when we started executing netconfig and other puppet
>> tasks on each deployment run.
>>
>> So we need to investigate in which release we introduced re-execution
>> of some tasks on controllers, and only then think about bulletins.
>>
>>
>> Thanks,
>> Igor
>>
>> On Fri, Jan 22, 2016 at 1:06 PM, Bogdan Dobrelya <bdobrelia at mirantis.com>
>> wrote:
>> > On 22.01.2016 11:45, Dmitry Pyzhov wrote:
>> >> Guys,
>> >>
>> >> There is a tricky bug with our 'stop deployment'
>> >> feature: https://bugs.launchpad.net/fuel/+bug/1529691
>> >>
>> >> It cannot be fixed easily because it is a design flaw. By design we
>> >> cannot leave a node in an unpredictable state. So we move all nodes that
>> >> are not in the ready state back to bootstrap.
>> >>
>> >> But when a user adds a node and deploys the cluster, the system reruns
>> >> puppet on controllers. If the user presses the 'stop' button, the
>> >> controllers will be erased. The cluster will be destroyed. Definitely
>> >> this is not expected behaviour.
>> >
>> > This is the critical issue. The *worst* of possible situations for
>> > cluster operations. I believe this should be covered by a dedicated
>> > bulletin issued, the stop action shall be disabled for all releases as
>> > emergency fix, and fixed by next maintenance updates.
>> >
>> >>
>> >> Taking into account that we are going to rewrite this feature in 9.0
>> >> and we are close to HCF there is no value in major changes for this
>> >> feature in 8.0. Let's do a simple workaround.
>> >>
>> >> We can mark a cluster 'operational' after successful deployment. And we
>> >> can disable 'stop' button on this kind of clusters.
>> >>
>> >> Any concerns or other proposals?
>> >>
>> >>
>> >>
>> >> __________________________________________________________________________
>> >> OpenStack Development Mailing List (not for usage questions)
>> >> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> >>
>> >
>> >
>> > --
>> > Best regards,
>> > Bogdan Dobrelya,
>> > Irc #bogdando
>> >
>> >