Open Stack

Mon Oct 12 10:42:09 UTC 2015

On 10/06/2015 04:34 PM, Matthew Booth wrote:
> Hi, Roman,
> 
> Evacuate has been on my radar for a while and this post has prodded me
> to take a look at the code. I think it's worth starting by explaining
> the problems in the current solution. Nova client is currently
> responsible for doing this evacuate. It does:
>

</snipping a lot of reasonable text>

> 
> I believe we can solve this problem, but I think that without fixing
> single-instance evacuate we're just pushing the problem around (or
> creating new places for it to live). I would base the robustness of my
> implementation on a single principle:
> 
>   An instance has a single owner, which is exclusively responsible for
> rebuilding it.
> 
> In outline, I would redefine the evacuate process to do:
> 
> API:
> 1. Call the scheduler to get a destination for the evacuate if none was
> given.
> 2. Atomically update instance.host to this destination, and task state
> to rebuilding.
> 

We can't do this because of resource tracking: the host switch has to
happen after the claim, and the claim can only be made on the target
compute. Otherwise we don't track the resources properly (*).
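
To illustrate the ordering constraint, here is a toy sketch (not Nova
code). ToyResourceTracker, Instance and evacuate_to are hypothetical
names made up for this example, and memory is the only resource that is
modelled. The point is only that the claim runs first, on the target,
and instance.host changes after it succeeds:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Instance:
    uuid: str
    host: str
    memory_mb: int
    task_state: Optional[str] = None


@dataclass
class ToyResourceTracker:
    """Hypothetical per-host tracker; an assumption, not Nova's class."""
    host: str
    free_memory_mb: int
    claims: Dict[str, int] = field(default_factory=dict)

    def claim(self, instance: Instance) -> None:
        # Refuse the claim if this host cannot fit the instance.
        if instance.memory_mb > self.free_memory_mb:
            raise RuntimeError("insufficient memory on " + self.host)
        self.free_memory_mb -= instance.memory_mb
        self.claims[instance.uuid] = instance.memory_mb


def evacuate_to(instance: Instance, target: ToyResourceTracker) -> None:
    """Claim on the target first, and only then switch the owner."""
    target.claim(instance)        # may raise; instance.host is untouched
    instance.host = target.host   # resources are already accounted for
    instance.task_state = "rebuilding"
```

If the claim fails, the instance still points at its old host and the
target's accounting is unchanged, which is the property that is lost
when instance.host is updated up front in the API.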

That does not invalidate your more general point, which is that we need
a way to make sure that started evacuations can be picked up and resumed
in case of any failures along the way (even a failure of the target host
during the rebuild).

dansmith did some work on this [1], and I later built upon it [2]. I
think our assumption was that we would use the migration record for
this, which _I think_ gives us all the stuff you talk about further
below, apart of course from there being a need for an external task to
actually see the evacuation through to the end. I think this is in line
with most HA design proposals, where we make sure our control plane is
redundant while we really don't care about individual compute nodes
(apart from the instances they host).
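
As a rough sketch of what "use the migration record" could mean for an
external task (an illustration only: EvacuationRecord, its status values
and resume() are hypothetical and do not mirror Nova's Migration
object), any record that has not reached "done" is simply picked up
again:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvacuationRecord:
    """Hypothetical stand-in for a migration record."""
    instance_uuid: str
    source_host: str
    dest_host: str
    status: str = "accepted"  # assumed states: accepted, running, done


def pending(records: List[EvacuationRecord]) -> List[EvacuationRecord]:
    # Anything that never reached "done" still needs to be seen through.
    return [r for r in records if r.status != "done"]


def resume(records: List[EvacuationRecord],
           rebuild: Callable[[EvacuationRecord], None]) -> List[str]:
    """Retry every unfinished evacuation; return the UUIDs completed."""
    completed = []
    for rec in pending(records):
        rec.status = "running"
        try:
            rebuild(rec)
        except Exception:
            # Leave the record unfinished so a later pass retries it.
            rec.status = "accepted"
            continue
        rec.status = "done"
        completed.append(rec.instance_uuid)
    return completed
```

Running resume() a second time over the same records does nothing for
the ones already marked "done", so whatever drives it can be restarted
freely.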

I am also not sure that leaving the actual building of the instance up
to a periodic task is a good choice if we want to minimize downtime,
which seems to me to be the point of the instance HA proposals.

N.

(*) We could "solve" this by checking instance.task_state, for example,
but IMHO we shouldn't go down that route as it becomes way more difficult
to reason about resource tracking once you introduce one more free
variable.

[1]
https://github.com/openstack/nova/blob/02b7e64b29dd707c637ea7026d337e5cb196f337/nova/compute/api.py#L3303
[2]
https://github.com/openstack/nova/blob/02b7e64b29dd707c637ea7026d337e5cb196f337/nova/compute/manager.py#L2702

> Compute:
> 3. Rebuild the instance.
> 
> This would be supported by a periodic task on the compute host which
> looks for rebuilding instances assigned to this host which aren't
> currently rebuilding, and kicks off a rebuild for them. This would cover
> the compute going down during a rebuild, or the api going down before
> messaging the compute.
> 
> Implementing this gives us several things:
> 
> 1. The list instances, evacuate all instances process becomes
> idempotent, because as soon as the evacuate is initiated, the instance
> is removed from the source host.
> 2. We get automatic recovery of failure of the target compute. Because
> we atomically moved the instance to the target compute immediately, if
> the target compute also has to be evacuated, our instance won't fall
> through the gap.
> 3. We don't need an additional place for the code to run, because it
> will run on the compute. All the work has to be done by the compute
> anyway. By farming the evacuates out directly and immediately to the
> target compute we reduce both overhead and complexity.
> 
> The coordination becomes very simple. If we've run the nova client
> evacuation anywhere at least once, the actual evacuations are now
> Somebody Else's Problem (to quote h2g2), and will complete eventually. As
> evacuation in any case involves a forced change of owner it requires
> fencing of the source and implies an external agent such as pacemaker.
> The nova client evacuation can run in pacemaker.
> 
> Matt
> 
> On Fri, Oct 2, 2015 at 2:05 PM, Roman Dobosz <roman.dobosz at intel.com
> <mailto:roman.dobosz at intel.com>> wrote:
> 
>     Hi all,
> 
>     The case of automatic evacuation (or resurrection currently), is a topic
>     which surfaces once in a while, but it isn't yet fully supported by
>     OpenStack and/or by the cluster services. There were some attempts to
>     bring the feature into OpenStack, however it turns out it cannot be
>     easily integrated. On the other hand, evacuation may be executed
>     from the outside using Nova client or Nova API calls for evacuation
>     initiation.
> 
>     I did some research into how it could be designed, based on
>     Russell Bryant's blog post[1] as a starting point. Apart from it, I've
>     also taken high availability and reliability into consideration when
>     designing the solution.
> 
>     Together with a coworker, we did a first PoC[2] to enable the cluster
>     to perform evacuation. The idea behind that PoC was simple: provide an
>     additional, small service which would trigger and supervise the
>     evacuation process, which would be triggered from the outside (in this
>     example we were using the Pacemaker fencing facility, but it might be
>     anything) using RabbitMQ directly. Those services are running on the
>     control plane in AA fashion.
> 
>     That worked well for us. So we started exploring other possibilities like
>     oslo.messaging, just to use it in the same manner as we did in the PoC.
>     It turns out that the implementation will not be as easy, because there
>     is no facility in oslo.messaging for sending an ACK from the
>     client after the job is done (not as soon as it gets the message). We
>     also looked at the existing OpenStack projects for a candidate which
>     provides a service for managing long-running tasks.
> 
>     There is the Mistral project, which gives us almost all the features we
>     need. The one missing feature is the HA of the Mistral tasks execution.
> 
>     The question is, how could such a problem (long-running tasks) be
>     resolved in OpenStack?
> 
>     [1]
>     http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/
>     [2] https://github.com/dawiddeja/evacuationd
> 
>     --
>     Cheers,
>     Roman Dobosz
> 
>     __________________________________________________________________________
>     OpenStack Development Mailing List (not for usage questions)
>     Unsubscribe:
>     OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>     <http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe>
>     http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> 
> 
> 

[openstack-dev] [nova][mistral] Automatic evacuation as a long running task
