[openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades

Clint Byrum clint at fewbar.com
Tue Sep 9 14:02:35 UTC 2014


Excerpts from Mike Scherbakov's message of 2014-09-09 00:35:09 -0700:
> Hi all,
> please see below original email below from Dmitry. I've modified the
> subject to bring larger audience to the issue.
> 
> I'd like to split the issue into two parts:
> 
>    1. Maintenance mode for OpenStack controllers in HA mode (HA-ed
>    Keystone, Glance, etc.)
>    2. Maintenance mode for OpenStack computes/storage nodes (no HA)
> 
> For first category, we might not need to have maintenance mode at all. For
> example, if we apply patching/upgrade one by one node to 3-node HA cluster,
> 2 nodes will serve requests normally. Is that possible for our HA solutions
> in Fuel, TripleO, other frameworks?

You may have a broken cloud if you are pushing out an update that
requires a new schema. Some services are better than others about
handling old schemas, and can be upgraded before doing schema upgrades.
But most of the time you have to do at least a brief downtime:

 * turn off DB accessing services
 * update code
 * run db migration
 * turn on DB accessing services

It is for this very reason, I believe, that Turbo Hipster was added to
the gate, so that deployers running against the upstream master branches
can have a chance at performing these upgrades in a reasonable amount of
time.

> 
> For second category, can not we simply do "nova-manage service disable...",
> so scheduler will simply stop scheduling new workloads on particular host
> which we want to do maintenance on?
> 

You probably would want 'nova host-servers-migrate <host>' at that
point, assuming you have migration set up.

http://docs.openstack.org/user-guide/content/novaclient_commands.html

> On Thu, Aug 28, 2014 at 6:44 PM, Dmitry Pyzhov <dpyzhov at mirantis.com> wrote:
> 
> > All,
> >
> > I'm not sure if it deserves to be mentioned in our documentation, this
> > seems to be a common practice. If an administrator wants to patch his
> > environment, he should be prepared for a temporary downtime of OpenStack
> > services. And he should plan to perform patching in advance: choose a time
> > with minimal load and warn users about possible interruptions of service
> > availability.
> >
> > Our current implementation of patching does not protect from downtime
> > during the patching procedure. HA deployments seems to be more or less
> > stable. But it looks like it is possible to schedule an action on a compute
> > node and get an error because of service restart. Deployments with one
> > controller... well, you won’t be able to use your cluster until the
> > patching is finished. There is no way to get rid of downtime here.
> >
> > As I understand, we can get rid of possible issues with computes in HA.
> > But it will require migration of instances and stopping of nova-compute
> > service before patching. And it will make the overall patching procedure
> > much longer. Do we want to investigate this process?
> >
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> >
> 



More information about the OpenStack-dev mailing list