[openstack-dev] [nova][vmware][ironic] Configuring active/passive HA Nova compute
Matthew Booth
mbooth at redhat.com
Wed Feb 25 14:34:07 UTC 2015
On 25/02/15 11:51, Radoslav Gerganov wrote:
> On 02/23/2015 03:18 PM, Matthew Booth wrote:
>> On 23/02/15 12:13, Gary Kotton wrote:
>>>
>>>
>>> On 2/23/15, 2:05 PM, "Matthew Booth" <mbooth at redhat.com> wrote:
>>>
>>>> On 20/02/15 11:48, Matthew Booth wrote:
>>>>> Gary Kotton came across a doozy of a bug recently:
>>>>>
>>>>> https://bugs.launchpad.net/nova/+bug/1419785
>>>>>
>>>>> In short, when you start a Nova compute, it will query the
>>>>> driver for instances and compare that against the expected host
>>>>> of the instance according to the DB. If the driver reports an
>>>>> instance the DB thinks is on a different host, Nova assumes the
>>>>> instance was evacuated while Nova compute was down, and deletes
>>>>> it on the hypervisor. However, Gary found that you trigger this
>>>>> when starting up a backup HA node which has a different `host`
>>>>> config setting. i.e. You fail over, and the first thing it does
>>>>> is delete all your instances.
>>>>>
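[Editor's note: the evacuation check described above can be sketched as follows. This is a minimal illustration, not Nova's actual code; the record layout and all names are hypothetical.]

```python
# Instances the hypervisor reports, but whose DB record points at a
# different compute host, are treated as "evacuated" and destroyed.

def find_stale_instances(driver_uuids, db_instances, my_host):
    """Return DB records for instances running on this hypervisor
    whose recorded host differs from this compute's `host` value."""
    return [inst for inst in db_instances
            if inst["uuid"] in driver_uuids and inst["host"] != my_host]

driver_uuids = {"a1", "b2", "c3"}  # what the hypervisor reports running
db_instances = [{"uuid": u, "host": "compute-1"} for u in ("a1", "b2", "c3")]

# Started with host=compute-1: nothing appears evacuated.
assert find_stale_instances(driver_uuids, db_instances, "compute-1") == []

# A backup HA node started with host=compute-2 sees *every* instance
# as "evacuated" -- the delete-everything behaviour in the bug report.
assert len(find_stale_instances(driver_uuids, db_instances, "compute-2")) == 3
```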
>>>>> Gary and I both agree on a couple of things:
>>>>>
>>>>> 1. Deleting all your instances is bad
>>>>> 2. HA nova compute is highly desirable for some drivers
>>>>>
>>>>> We disagree on the approach to fixing it, though. Gary posted this:
>>>>>
>>>>> https://review.openstack.org/#/c/154029/
>>>>>
>>>>> I've already outlined my objections to this approach elsewhere, but to
>>>>> summarise I think this fixes 1 symptom of a design problem, and leaves
>>>>> the rest untouched. If the value of nova compute's `host`
>>>>> changes, then the assumption that instances associated with that
>>>>> compute can be identified by the value of instance.host becomes
>>>>> invalid. This assumption is pervasive, so it breaks a lot of
>>>>> stuff. The worst one is _destroy_evacuated_instances(), which
>>>>> Gary found, but if you scan nova/compute/manager for the string
>>>>> 'self.host' you'll find lots of them. For example, all the
>>>>> periodic tasks are broken, including image cache management, and
>>>>> the state of ResourceTracker will be unusual. Worse, whenever a
>>>>> new instance is created it will have a different value of
>>>>> instance.host, so instances running on a single hypervisor will
>>>>> become partitioned based on which nova compute was used to create
>>>>> them.
>>>>>
>>>>> In short, the system may appear to function superficially, but it's
>>>>> unsupportable.
>>>>>
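[Editor's note: the partitioning problem described above can be sketched as follows. Each instance records the `host` of whichever nova-compute created it, so any code filtering on self.host sees only "its" subset. Hypothetical data, not Nova code.]

```python
# Three instances on ONE hypervisor, created via two differently
# configured nova-compute processes:
instances = [
    {"uuid": "a1", "host": "compute-1"},
    {"uuid": "b2", "host": "compute-2"},  # created after failover
    {"uuid": "c3", "host": "compute-1"},
]

def visible_to(host):
    # what a periodic task filtering on self.host would operate on
    return [i["uuid"] for i in instances if i["host"] == host]

assert visible_to("compute-1") == ["a1", "c3"]
assert visible_to("compute-2") == ["b2"]
# Neither node's periodic tasks ever see the full set of instances
# actually running on the hypervisor.
```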
>>>>> I had an alternative idea. The current assumption is that the
>>>>> `host` managing a single hypervisor never changes. If we break
>>>>> that assumption, we break Nova, so we could assert it at startup
>>>>> and refuse to start if it's violated. I posted this
>>>>> VMware-specific POC:
>>>>>
>>>>> https://review.openstack.org/#/c/154907/
>>>>>
>>>>> However, I think I've had a better idea. Nova creates ComputeNode
>>>>> objects for its current configuration at startup which, amongst
>>>>> other things, are a map of host:hypervisor_hostname. We could
>>>>> assert when creating a ComputeNode that hypervisor_hostname is
>>>>> not already associated with a different host, and refuse to start
>>>>> if it is. We would give an appropriate error message explaining
>>>>> that this is a misconfiguration. This would prevent the user from
>>>>> hitting any of the associated problems, including the deletion of
>>>>> all their instances.
>>>>
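[Editor's note: the startup assertion proposed above amounts to something like the following. The actual patch is at the review linked below; the names and mapping here are illustrative only.]

```python
class MisconfigurationError(Exception):
    """Raised instead of deleting instances: refuse to start if this
    hypervisor is already claimed by a differently configured host."""

def assert_single_host(node_map, host, hypervisor_hostname):
    """node_map: existing hypervisor_hostname -> host mapping."""
    existing = node_map.get(hypervisor_hostname)
    if existing is not None and existing != host:
        raise MisconfigurationError(
            "hypervisor %s is already managed by host %s; refusing to "
            "start as host %s" % (hypervisor_hostname, existing, host))
    node_map[hypervisor_hostname] = host

nodes = {}
assert_single_host(nodes, "compute-1", "cluster-A")  # first start: ok
assert_single_host(nodes, "compute-1", "cluster-A")  # restart: ok
try:
    assert_single_host(nodes, "compute-2", "cluster-A")  # backup node
except MisconfigurationError:
    pass  # startup is refused instead of instances being deleted
```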
>>>> I have posted a patch implementing the above for review here:
>>>>
>>>> https://review.openstack.org/#/c/158269/
>>>
>>> I have to look at what you have posted. I think that this topic is
>>> something that we should speak about at the summit and this should fall
>>> under some BP and well defined spec. I really would not like to see
>>> existing installations being broken if and when this patch lands. It may
>>> also affect Ironic as it works on the same model.
>>
>> This patch will only affect installations configured with multiple
>> compute hosts for a single hypervisor. These are already broken, so this
>> patch will at least let them know if they haven't already noticed.
>>
>> It won't affect Ironic, because they configure all compute hosts to have
>> the same 'host' value. An Ironic user would only notice this patch if
>> they accidentally misconfigured it, which is the intended behaviour.
>>
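[Editor's note: the Ironic pattern described above amounts to giving every nova-compute process the same `host` value in nova.conf; the value below is illustrative.]

    [DEFAULT]
    # Identical on every nova-compute managing this Ironic deployment:
    host = ironic-compute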
>> Incidentally, I also support more focus on the design here. Until we
>> come up with a better design, though, we need to do our best to prevent
>> non-trivial corruption from a trivial misconfiguration. I think we need
>> to merge this, or something like it, now and still have a summit
>> discussion.
>>
>> Matt
>>
>
> Hi Matt,
>
> I already posted a comment on your patch but I'd like to reiterate here
> as well. Currently the VMware driver is using the cluster name as
> hypervisor_hostname which is a problem because you can have different
> clusters with the same name. We already have a critical bug filed for
> this:
>
> https://bugs.launchpad.net/nova/+bug/1329261
>
> There was an attempt to fix this by using a combination of vCenter UUID
> + cluster_name but it was rejected because this combination was not
> considered a 'real' hostname. I think that if we go for a DB schema
> change we can fix both issues by renaming hypervisor_hostname to
> hypervisor_id and making it unique. What do you think?
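[Editor's note: the vCenter UUID + cluster_name combination mentioned above could look like the following. Purely illustrative; this is not the rejected patch's code.]

```python
# Two clusters may share a name, but no two share a vCenter UUID, so
# the combined value is unique across vCenters.

def make_hypervisor_id(vcenter_uuid, cluster_name):
    return "%s.%s" % (vcenter_uuid, cluster_name)

a = make_hypervisor_id("deadbeef-0000", "Cluster1")
b = make_hypervisor_id("cafebabe-1111", "Cluster1")  # same name, other vCenter
assert a != b  # cluster names collide, combined ids do not
```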
Well, I think hypervisor_id makes more sense than hypervisor_hostname.
The latter is confusing. However, I'd prefer not to complicate this
change with it. I'm pessimistic enough as it is with its current scope.
Re the cluster name change, I assume you're referring to this change:
https://review.openstack.org/#/c/99623/
I have to say I don't agree with the reasoning behind the rejection.
The only thing the API layer is going to be able to do is check
whether it's got dots in it, and it would only pass for Ironic's UUIDs
coincidentally. Still, it's a trivial change to that patch to make it
pass, so we should just do it.
I think this issue is orthogonal to my patch, though, because the
current behaviour is already unintentional and broken.
Matt
--
Matthew Booth
Red Hat Engineering, Virtualisation Team
Phone: +442070094448 (UK)
GPG ID: D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490