[openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades

John Garbutt john at johngarbutt.com
Fri Oct 17 09:18:01 UTC 2014


On 17 October 2014 02:28, Matt Riedemann <mriedem at linux.vnet.ibm.com> wrote:
>
>
> On 10/16/2014 7:26 PM, Christopher Aedo wrote:
>>
>> On Tue, Sep 9, 2014 at 2:19 PM, Mike Scherbakov
>> <mscherbakov at mirantis.com> wrote:
>>>>
>>>> On Tue, Sep 9, 2014 at 6:02 PM, Clint Byrum <clint at fewbar.com> wrote:
>>>
>>> The idea is not to simply deny or hang requests from clients, but to
>>> answer them with "we are in maintenance mode, retry in X seconds"
>>>
>>>> You probably would want 'nova host-servers-migrate <host>'
>>>
>>> yeah for migrations - but as far as I understand, it doesn't help with
>>> disabling this host in the scheduler - there can still be a chance that
>>> some workloads will be scheduled to the host.
>>
>>
>> Regarding putting a compute host in maintenance mode using "nova
>> host-update --maintenance enable", it looks like the blueprint and
>> associated commits were abandoned a year and a half ago:
>> https://blueprints.launchpad.net/nova/+spec/host-maintenance
>>
>> It seems that "nova service-disable <host> nova-compute" effectively
>> prevents the scheduler from trying to send new work there.  Is this
>> the best approach to use right now if you want to pull a compute host
>> out of an environment before migrating VMs off?
>>
>> I agree with Tim and Mike that having something respond "down for
>> maintenance" rather than ignore or hang would be really valuable.  But
>> it also looks like that hasn't gotten much traction in the past -
>> anyone feel like they'd be in support of reviving the notion of
>> "maintenance mode"?
>>
>> -Christopher
>>
>
> host-maintenance-mode is definitely a thing in nova compute via the os-hosts
> API extension and the --maintenance parameter; the compute manager code is
> here [1].  The thing is, the only in-tree virt driver that implements it is
> xenapi. I believe that when you put the host in maintenance mode it's
> supposed to automatically evacuate the instances to some other host, but you
> can't target the other host or tell the driver, from the API, which
> instances you want to evacuate, e.g. all, none, running only, etc.
>
> [1]
> http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2#n3990

We should certainly make that more generic. It doesn't update the VM
state, so it's really only admin-focused in its current form.

The XenAPI logic only works when using XenServer pools with shared NFS
storage, if my memory serves me correctly. Honestly, it's a bit of code
I have planned on removing, along with the rest of the pool support.

In terms of requiring DB downtime in Nova, the current efforts are
focusing on avoiding downtime altogether, via expand/contract style
migrations, with a little help from objects to avoid data migrations.
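
To make the expand/contract idea concrete, here is a rough sketch using
sqlite3 from the Python standard library. It is not Nova's migration
code: the "instances" table, the "name" and "display_name" columns and
all function names are made up for illustration. The expand step only
adds schema, the read path backfills rows lazily (roughly the part
objects would play), and the contract step runs only once nothing
depends on the old column.

```python
import sqlite3


def expand(conn):
    # Additive change only: old code that ignores the new column keeps working,
    # so this step can run while services are still up.
    conn.execute("ALTER TABLE instances ADD COLUMN display_name TEXT")


def read_name(conn, instance_id):
    # Object-layer style read: prefer the new column, fall back to the old one,
    # and backfill the row on access instead of running a bulk data migration.
    row = conn.execute(
        "SELECT name, display_name FROM instances WHERE id = ?", (instance_id,)
    ).fetchone()
    if row is None:
        return None
    old, new = row
    if new is None and old is not None:
        conn.execute(
            "UPDATE instances SET display_name = ? WHERE id = ?",
            (old, instance_id),
        )
        new = old
    return new


def can_contract(conn):
    # Contracting is safe only when no row still relies on the old column.
    remaining = conn.execute(
        "SELECT COUNT(*) FROM instances "
        "WHERE display_name IS NULL AND name IS NOT NULL"
    ).fetchone()[0]
    return remaining == 0


def contract(conn):
    # Drop the old column by rebuilding the table, which also works on
    # SQLite versions without ALTER TABLE ... DROP COLUMN.
    conn.execute(
        "CREATE TABLE instances_new (id INTEGER PRIMARY KEY, display_name TEXT)"
    )
    conn.execute(
        "INSERT INTO instances_new SELECT id, display_name FROM instances"
    )
    conn.execute("DROP TABLE instances")
    conn.execute("ALTER TABLE instances_new RENAME TO instances")
```

The point of the split is that expand and contract can be scheduled
separately, with the backfill happening in between while the service
stays up.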

That doesn't mean maintenance mode is not useful for other things,
like emergency patching of the hypervisor.
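
For the "we are in maintenance mode, retry in X seconds" response
mentioned earlier in the thread, one possible shape is a small WSGI
middleware in front of the API that returns HTTP 503 with a
Retry-After header. This is only a sketch under assumptions: the class
name, the is_in_maintenance callable and the JSON body are hypothetical
and do not correspond to anything that exists in Nova today.

```python
import json


class MaintenanceMiddleware:
    """Hypothetical WSGI wrapper that short-circuits requests during maintenance."""

    def __init__(self, app, is_in_maintenance, retry_after=30):
        self.app = app
        # Callable returning True while maintenance is active (for example
        # backed by a config flag or a DB row; that choice is left open here).
        self.is_in_maintenance = is_in_maintenance
        self.retry_after = retry_after

    def __call__(self, environ, start_response):
        if self.is_in_maintenance():
            body = json.dumps(
                {"error": "maintenance", "retry_after": self.retry_after}
            ).encode("utf-8")
            start_response(
                "503 Service Unavailable",
                [
                    ("Content-Type", "application/json"),
                    # Standard HTTP header telling clients when to retry.
                    ("Retry-After", str(self.retry_after)),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Not in maintenance: pass the request through untouched.
        return self.app(environ, start_response)
```

Clients that honour Retry-After would then back off instead of hanging
or treating the outage as a hard failure.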

John


