[openstack-dev] [nova][vmware][ironic] Configuring active/passive HA Nova compute
Joe Gordon
joe.gordon0 at gmail.com
Wed Feb 25 20:18:19 UTC 2015
On Fri, Feb 20, 2015 at 3:48 AM, Matthew Booth <mbooth at redhat.com> wrote:
> Gary Kotton came across a doozy of a bug recently:
>
> https://bugs.launchpad.net/nova/+bug/1419785
>
> In short, when Nova compute starts, it queries the driver for
> instances and compares them against the expected host of each
> instance according to the DB. If the driver reports an instance the
> DB thinks is on a different host, Nova assumes the instance was
> evacuated while Nova compute was down, and deletes it on the
> hypervisor. However, Gary found that you can also trigger this by
> starting a backup HA node which has a different `host` config
> setting: you fail over, and the first thing the new node does is
> delete all your instances.
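>
> To make the failure mode concrete, the startup cleanup boils down to
> something like this (a simplified sketch of the logic, not the actual
> manager code; helper names and the destroy() signature are
> approximate):
>
>     # Inside ComputeManager, run from init_host() at startup.
>     def _destroy_evacuated_instances(self, context):
>         # Ask the driver (e.g. vCenter) which instances it runs.
>         for instance in self._get_instances_on_driver(context):
>             # The DB says this instance belongs to a different
>             # 'host', so assume it was evacuated while we were
>             # down and delete the "stale" local copy. On a standby
>             # started with a different 'host' value, *every*
>             # instance matches, so everything gets deleted.
>             if instance.host != self.host:
>                 self.driver.destroy(context, instance,
>                                     network_info=None,
>                                     block_device_info=None)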
>
> Gary and I both agree on a couple of things:
>
> 1. Deleting all your instances is bad
> 2. HA nova compute is highly desirable for some drivers
>
There is a deeper issue here that we are trying to work around. Nova
was never designed to have entire systems running behind a single
nova-compute; it was designed to have one nova-compute per physical
box that runs instances. There have been many discussions in the past
on how to fix this (by adding a new point in Nova where clustered
systems can plug in), but if I remember correctly the gotcha was that
no one was willing to step up and do the work.
>
> We disagree on the approach to fixing it, though. Gary posted this:
>
> https://review.openstack.org/#/c/154029/
>
> I've already outlined my objections to this approach elsewhere, but
> to summarise: I think it fixes one symptom of a design problem and
> leaves the rest untouched. If the value of Nova compute's `host`
> changes, then the assumption that instances associated with that
> compute can be identified by the value of instance.host becomes
> invalid. This assumption is pervasive, so it breaks a lot of stuff.
> The worst case is _destroy_evacuated_instances(), which Gary found,
> but if you scan nova/compute/manager.py for the string 'self.host'
> you'll find lots more. For example, all the periodic tasks are
> broken, including image cache management, and the ResourceTracker's
> state will be inconsistent. Worse, every new instance will be created
> with the new value of instance.host, so instances running on a single
> hypervisor become partitioned according to which Nova compute created
> them.
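>
> The breakage mostly follows one pattern: the working set is selected
> by the service's own host value. Roughly (a sketch of the idiom, not
> a specific line of Nova code):
>
>     # Inside a ComputeManager periodic task. Instances recorded
>     # under the old 'host' value never show up here, so e.g. image
>     # cache management silently ignores them.
>     instances = objects.InstanceList.get_by_host(context, self.host)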
>
> In short, the system may appear to function superficially, but it's
> unsupportable.
>
> I had an alternative idea. The current assumption is that the `host`
> managing a single hypervisor never changes. If we break that assumption,
> we break Nova, so we could assert it at startup and refuse to start if
> it's violated. I posted this VMware-specific POC:
>
> https://review.openstack.org/#/c/154907/
>
> However, I think I've had a better idea. At startup, Nova creates
> ComputeNode objects for its current configuration which, amongst
> other things, record a host:hypervisor_hostname mapping. We could
> assert when creating a ComputeNode that hypervisor_hostname is not
> already associated with a different host, and refuse to start if it
> is, with an appropriate error message explaining that this is a
> misconfiguration. This would prevent the user from hitting any of the
> associated problems, including the deletion of all their instances.
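>
> A minimal sketch of the check (the lookup helper and the exception
> type are placeholders for illustration, not the final patch):
>
>     def assert_hypervisor_unclaimed(cnode):
>         # Hypothetical lookup: find any existing ComputeNode record
>         # for this hypervisor_hostname, whatever its service host.
>         other = lookup_compute_node(cnode.hypervisor_hostname)
>         if other is not None and other.host != cnode.host:
>             raise RuntimeError(
>                 "hypervisor %s is already managed by host '%s'; "
>                 "refusing to start with host='%s'. Check the "
>                 "'host' option in nova.conf."
>                 % (cnode.hypervisor_hostname, other.host,
>                    cnode.host))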
>
> We can still do active/passive HA!
>
> If we configure both nodes in the active/passive cluster identically,
> including the same value of `host`, I don't see why this shouldn't
> work today. I don't even think the configuration is onerous. All we
> would be doing is preventing the user from accidentally running a
> misconfigured HA setup which leads to inconsistent state and
> eventually requires manual cleanup.
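>
> Concretely, both members of the pair would carry the same value in
> nova.conf (the value below is just an example):
>
>     [DEFAULT]
>     # Must be identical on the active and passive node, and must
>     # never change for the lifetime of the hypervisor.
>     host = vcenter-cluster-1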
>
> We would still have to be careful that we don't bring up both nova
> computes simultaneously. The VMware driver, at least, has hardcoded
> assumptions that it is the only writer in certain circumstances. That
> problem would have to be handled separately, perhaps at the messaging
> layer.
>
> Matt
> --
> Matthew Booth
> Red Hat Engineering, Virtualisation Team
>
> Phone: +442070094448 (UK)
> GPG ID: D33C3490
> GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490