[openstack-dev] [nova][mistral] Automatic evacuation as a long running task
Nikola Đipanov
ndipanov at redhat.com
Mon Oct 12 10:42:09 UTC 2015
On 10/06/2015 04:34 PM, Matthew Booth wrote:
> Hi, Roman,
>
> Evacuated has been on my radar for a while and this post has prodded me
> to take a look at the code. I think it's worth starting by explaining
> the problems in the current solution. Nova client is currently
> responsible for doing this evacuate. It does:
>
</snipping a lot of reasonable text>
>
> I believe we can solve this problem, but I think that without fixing
> single-instance evacuate we're just pushing the problem around (or
> creating new places for it to live). I would base the robustness of my
> implementation on a single principal:
>
> An instance has a single owner, which is exclusively responsible for
> rebuilding it.
>
> In outline, I would redefine the evacuate process to do:
>
> API:
> 1. Call the scheduler to get a destination for the evacuate if none was
> given.
> 2. Atomically update instance.host to this destination, and task state
> to rebuilding.
>
We can't do this because of resource tracking - the host switch has to
be done after the claim is done which can happen only on the target
compute, otherwise we don't track the resources properly (*).
That does not invalidate your more general point which is that we need a
way to make sure that started evacuations can be picked up and resumed
in case of any failures along the way (even a rebuild failure of the
target host that may have failed during the process).
Some work that dansmith did [1] and I later built upon some of that work
[2]. I think our assumption was that we would use the migration record
for this, which _I think_ gives us all the stuff you talk about further
below, apart of course from there being a need for an external task to
actually see the evacuation through to the end. I think this is in-line
with most HA design proposals, where we make sure our control plane is
redundant while we really don't care about individual compute nodes
(apart from the instances they host).
I am also not sure that leaving the actual building of the instance up
to a periodic task is a good choice if we want to minimize downtime
which seem to me to be the point of the instance HA proposals.
N.
(*) We could "solve" this by checkin instance.task_state for example but
IMHO we shouldn't go down that route as it becomes way more difficult to
reason about resource tracking once you introduce one more free-variable.
[1]
https://github.com/openstack/nova/blob/02b7e64b29dd707c637ea7026d337e5cb196f337/nova/compute/api.py#L3303
[2]
https://github.com/openstack/nova/blob/02b7e64b29dd707c637ea7026d337e5cb196f337/nova/compute/manager.py#L2702
> Compute:
> 3. Rebuild the instance.
>
> This would be supported by a periodic task on the compute host which
> looks for rebuilding instances assigned to this host which aren't
> currently rebuilding, and kicks off a rebuild for them. This would cover
> the compute going down during a rebuild, or the api going down before
> messaging the compute.
>
> Implementing this gives us several things:
>
> 1. The list instances, evacuate all instances process becomes
> idempotent, because as soon as the evacuate is initiated, the instance
> is removed from the source host.
> 2. We get automatic recovery of failure of the target compute. Because
> we atomically moved the instance to the target compute immediately, if
> the target compute also has to be evacuated, our instance won't fall
> through the gap.
> 3. We don't need an additional place for the code to run, because it
> will run on the compute. All the work has to be done by the compute
> anyway. By farming the evacuates out directly and immediately to the
> target compute we reduce both overhead and complexity.
>
> The coordination becomes very simple. If we've run the nova client
> evacuation anywhere at least once, the actual evacuations are now
> Sombody Else's Problem (to quote h2g2), and will complete eventually. As
> evacuation in any case involves a forced change of owner it requires
> fencing of the source and implies an external agent such as pacemaker.
> The nova client evacuation can run in pacemaker.
>
> Matt
>
> On Fri, Oct 2, 2015 at 2:05 PM, Roman Dobosz <roman.dobosz at intel.com
> <mailto:roman.dobosz at intel.com>> wrote:
>
> Hi all,
>
> The case of automatic evacuation (or resurrection currently), is a topic
> which surfaces once in a while, but it isn't yet fully supported by
> OpenStack and/or by the cluster services. There was some attempts to
> bring the feature into OpenStack, however it turns out it cannot be
> easily integrated with. On the other hand evacuation may be executed
> from the outside using Nova client or Nova API calls for evacuation
> initiation.
>
> I did some research regarding the ways how it could be designed, based
> on Russel Bryant blog post[1] as a starting point. Apart from it, I've
> also taken high availability and reliability into consideration when
> designing the solution.
>
> Together with coworker, we did first PoC[2] to enable cluster to be able
> to perform evacuation. The idea behind that PoC was simple - providing
> additional, small service which would trigger and supervise the
> evacuation process, which would be triggered from the outside (in this
> example we were using Pacemaker fencing facility, but it might be
> anything) using RabbitMQ directly. Those services are running on the
> control plane in AA fashion.
>
> That work well for us. So we started exploring other possibilities like
> oslo.messaging just to use it in the same manner as we did in the poc.
> It turns out that the implementation will not be as easy, because there
> is no facility in the oslo.messaging for letting sending an ACK from the
> client after the job is done (not as soon as it gets the message). We
> also looked at the existing OpenStack projects for a candidate which
> provide service for managing long running tasks.
>
> There is the Mistral project, which gives us almost all the features we
> need. The one missing feature is the HA of the Mistral tasks execution.
>
> The question is, how such problem (long running tasks) could be resolved
> in OpenStack?
>
> [1]
> http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/
> [2] https://github.com/dawiddeja/evacuationd
>
> --
> Cheers,
> Roman Dobosz
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe:
> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> <http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe>
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
More information about the OpenStack-dev
mailing list