[openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades

Sergii Golovatiuk sgolovatiuk at mirantis.com
Tue Sep 9 12:11:23 UTC 2014


Hi Fuelers,

1. Sometimes Fuel has non-reversible changes. Here are a couple of examples:
A new version needs to change/adjust Pacemaker primitives. Such changes
affect all controllers in the cluster.
An old API can be deprecated or a new API can be introduced. Until all
components are configured to use the new API, it's almost impossible to keep
half of the cluster on the old API and half on the new API.
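
To make the distinction concrete, here is a minimal sketch in Python. It is
not Fuel code: the Change class, the cluster_wide flag, plan_upgrade and the
controller names are all hypothetical. It assumes a change can be labeled in
advance as either safe to apply node by node or as affecting the whole
cluster at once.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical description of one change shipped by a patch."""
    name: str
    cluster_wide: bool  # True if it cannot be applied one node at a time


def plan_upgrade(controllers, changes):
    """Return upgrade steps; each step lists the controllers patched together."""
    if any(change.cluster_wide for change in changes):
        # A cluster-wide (non-reversible) change touches every controller,
        # so all of them go into a single maintenance window.
        return [list(controllers)]
    # Otherwise patch one node per step so the others keep serving requests.
    return [[node] for node in controllers]


controllers = ["ctrl-1", "ctrl-2", "ctrl-3"]  # assumed 3-node HA cluster
rolling = plan_upgrade(controllers, [Change("package update", False)])
window = plan_upgrade(controllers, [Change("Pacemaker primitive change", True)])
print(rolling)
print(window)
```

With only node-local changes the plan has three steps of one controller
each; a single cluster-wide change collapses it into one step that covers
all controllers.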

2. For computes, even if we stop services, VM instances should keep working.
I think it's possible to upgrade without downtime of VM instances, though I
am not sure if that's possible for Ceph nodes.
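
As an illustration of the idea for computes, here is a toy model in Python.
It is not Nova code: Host, schedule and the host/instance names are
hypothetical. It assumes that disabling a host only removes it from the
scheduler's candidate list and does not touch instances already running
there.

```python
class Host:
    """Hypothetical compute host as seen by a toy scheduler."""

    def __init__(self, name):
        self.name = name
        self.enabled = True   # scheduler may place new instances here
        self.instances = []   # instances already running on this host


def schedule(hosts, instance):
    """Place instance on the enabled host with the fewest instances.

    Returns the chosen host, or None if every host is disabled.
    """
    candidates = [host for host in hosts if host.enabled]
    if not candidates:
        return None
    target = min(candidates, key=lambda host: len(host.instances))
    target.instances.append(instance)
    return target


hosts = [Host("compute-1"), Host("compute-2")]
schedule(hosts, "vm-a")
schedule(hosts, "vm-b")

# Put compute-1 into maintenance: no new workloads, existing ones untouched.
hosts[0].enabled = False
placed = schedule(hosts, "vm-c")
print(placed.name, hosts[0].instances, hosts[1].instances)
```

In this model the instance on the disabled host stays where it is, and new
requests only land on the remaining enabled host.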




--
Best regards,
Sergii Golovatiuk,
Skype #golserge
IRC #holser

On Tue, Sep 9, 2014 at 9:35 AM, Mike Scherbakov <mscherbakov at mirantis.com>
wrote:

> Hi all,
> please see the original email below from Dmitry. I've modified the
> subject to bring a larger audience to the issue.
>
> I'd like to split the issue into two parts:
>
>    1. Maintenance mode for OpenStack controllers in HA mode (HA-ed
>    Keystone, Glance, etc.)
>    2. Maintenance mode for OpenStack computes/storage nodes (no HA)
>
> For the first category, we might not need maintenance mode at all. For
> example, if we patch/upgrade a 3-node HA cluster one node at a time,
> 2 nodes will serve requests normally. Is that possible for our HA solutions
> in Fuel, TripleO, and other frameworks?
>
> For the second category, can't we simply do "nova-manage service
> disable...", so that the scheduler stops scheduling new workloads on
> the particular host on which we want to do maintenance?
>
>
> On Thu, Aug 28, 2014 at 6:44 PM, Dmitry Pyzhov <dpyzhov at mirantis.com>
> wrote:
>
>> All,
>>
>> I'm not sure if it deserves to be mentioned in our documentation, since
>> this seems to be common practice. If an administrator wants to patch his
>> environment, he should be prepared for temporary downtime of OpenStack
>> services. And he should plan the patching in advance: choose a time
>> with minimal load and warn users about possible interruptions of service
>> availability.
>>
>> Our current implementation of patching does not protect against downtime
>> during the patching procedure. HA deployments seem to be more or less
>> stable. But it looks like it is possible to schedule an action on a compute
>> node and get an error because of a service restart. Deployments with one
>> controller... well, you won’t be able to use your cluster until the
>> patching is finished. There is no way to get rid of downtime here.
>>
>> As I understand it, we can get rid of possible issues with computes in HA.
>> But it will require migrating instances and stopping the nova-compute
>> service before patching. And it will make the overall patching procedure
>> much longer. Do we want to investigate this process?
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>
>
> --
> Mike Scherbakov
> #mihgen
>
>