[openstack-dev] [nova] Supporting force live-migrate and force evacuate with nested allocations

Balázs Gibizer balazs.gibizer at ericsson.com
Wed Oct 10 10:32:37 UTC 2018


Thanks for all the feedback. I feel the following consensus is forming:

1) Remove the force flag in a new microversion. I've proposed a spec 
for that API change [1].

2) In the old microversions, change the blind allocation copy to gather 
every resource from the nested source RPs too and try to allocate them 
from the destination root RP. In cases where the destination needs a 
nested allocation, putting this allocation to placement will fail and 
nova will fail the migration / evacuation. However, it will succeed if 
the server does not need a nested allocation on either the source or 
the destination host (a.k.a. the legacy case), or if the server has a 
nested allocation on the source host but does not need one on the 
destination host (for example, the dest host does not have a nested RP 
tree yet).
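
To make the enhanced copy in 2) concrete, here is a minimal sketch, not
the actual nova code. It assumes allocations are plain dicts keyed by RP
uuid with a "resources" mapping, similar to what placement returns. The
helper names, the RP names and the CUSTOM_BANDWIDTH resource class are
made up for illustration.

```python
from collections import Counter


def is_nested(source_allocations, source_root_rp):
    """Return True if anything is allocated from an RP other than the root.

    Assumption: sharing RPs are out of scope, so any non-root RP is
    treated as a nested RP.
    """
    return any(rp != source_root_rp for rp in source_allocations)


def collapse_to_root(source_allocations, dest_root_rp):
    """Sum each resource class over all source RPs and build a single
    allocation against the destination root RP."""
    totals = Counter()
    for alloc in source_allocations.values():
        totals.update(alloc["resources"])
    return {dest_root_rp: {"resources": dict(totals)}}


# Hypothetical source allocation: compute root RP plus one nested RP.
source = {
    "src-compute": {"resources": {"VCPU": 2, "MEMORY_MB": 2048}},
    "src-nic-1": {"resources": {"CUSTOM_BANDWIDTH": 1000}},
}
dest = collapse_to_root(source, "dest-compute")
```

If the destination root RP has no inventory for one of the collapsed
resource classes, the allocation request is expected to be rejected by
placement, and that is the point where nova would fail the operation.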

I will start implementing #2) as part of the 
use-nested-allocation-candidate bp soon and will continue with #1) 
later in the cycle.

Nothing is set in stone yet, so feedback is still very much appreciated.


[1] https://review.openstack.org/#/c/609330/

On Tue, Oct 9, 2018 at 11:40 AM, Balázs Gibizer 
<balazs.gibizer at ericsson.com> wrote:
> Hi,
> Setup
> -----
> nested allocation: an allocation that contains resources from one or 
> more nested RPs. (If you have a better term for this, please 
> suggest it.)
> If an instance has a nested allocation, it means that the compute it 
> allocates from has a nested RP tree. BUT if a compute has a nested 
> RP tree, it does not automatically mean that the instance allocating 
> from that compute has a nested allocation (e.g. bandwidth inventory 
> will be on nested RPs but not every instance will require bandwidth).
> Afaiu, as soon as we have NUMA modelling in place even the most trivial 
> servers will have nested allocations, as CPU and MEMORY inventory 
> will be moved to the nested NUMA RPs. But NUMA is still in the future.
> Sidenote: there is an edge case reported by bauzas when an instance 
> allocates _only_ from nested RPs. This was discussed last Friday 
> and it resulted in a new patch [0], but I would like to keep that 
> discussion separate from this one if possible.
> Sidenote: the current problem is somewhat related not just to nested 
> RPs but to sharing RPs as well. However, I'm not aiming to implement 
> sharing support in Nova right now, so I will also try to keep the 
> sharing discussion separate if possible.
> There was already some discussion at Monday's scheduler meeting 
> but I could not attend.
> http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-10-08-14.00.log.html#l-20
> The meat
> --------
> Both live-migrate [1] and evacuate [2] have an optional force flag in 
> the nova REST API. The documentation says: "Force <the action> by not 
> verifying the provided destination host by the scheduler."
> Nova implements this statement by not calling the scheduler if 
> force=True, BUT it still tries to manage allocations in placement.
> To have an allocation on the destination host, Nova blindly copies the 
> instance allocation from the source host to the destination host 
> during these operations. Nova can do that as 1) the whole allocation 
> is against a single RP (the compute RP) and 2) Nova knows both the 
> source compute RP and the destination compute RP.
> However, as soon as we bring nested allocations into the picture that 
> blind copy will not be feasible. Possible cases:
> 0) The instance has a non-nested allocation on the source and would 
> need a non-nested allocation on the destination. This works with the 
> blind copy today.
> 1) The instance has a nested allocation on the source and would need 
> a nested allocation on the destination as well.
> 2) The instance has a non-nested allocation on the source and would 
> need a nested allocation on the destination.
> 3) The instance has a nested allocation on the source and would need 
> a non-nested allocation on the destination.
> Nova cannot generate nested allocations easily without reimplementing 
> some of the placement allocation candidate (a_c) code. However I 
> don't like the idea of duplicating some of the a_c code in Nova.
> Nova cannot detect what kind of allocation (nested or non-nested) an 
> instance would need on the destination without calling placement a_c. 
> So knowing when to call placement is a chicken and egg problem.
> Possible solutions:
> A) fail fast
> ------------
> 0) Nova can detect that the source allocation is non-nested and try 
> the blind copy, and it will succeed.
> 1) Nova can detect that the source allocation is nested and fail the 
> operation.
> 2) Nova only sees a non-nested source allocation. Even if the dest RP 
> tree is nested it does not mean that the allocation will be nested. 
> We cannot fail fast. Nova can try the blind copy and allocate every 
> resource from the root RP of the destination. If the instance 
> requires a nested allocation instead, the claim will fail in placement. 
> So nova can fail the operation a bit later than in 1).
> 3) Nova can detect that the source allocation is nested and fail the 
> operation. However, an enhanced blind copy that tries to allocate 
> everything from the root RP on the destination would have worked.
> B) Guess when to ignore the force flag and call the scheduler
> -------------------------------------------------------------
> 0) keep the blind copy as it works
> 1) Nova detects that the source allocation is nested. It ignores the 
> force flag and calls the scheduler, which will call placement a_c. The 
> move operation can succeed.
> 2) Nova only sees a non-nested source allocation, so it will fall back 
> to the blind copy and fail at the claim on the destination.
> 3) Nova detects that the source allocation is nested. It ignores the 
> force flag and calls the scheduler, which will call placement a_c. The 
> move operation can succeed.
> This solution would be against the API doc, which states that nova does 
> not call the scheduler if the operation is forced. However, in the case 
> of force live-migration Nova already verifies the target host from 
> a couple of perspectives in [3].
> This solution is already proposed for live-migrate in [4] and for 
> evacuate in [5], so the complexity of the solution can be seen in the 
> reviews.
> C) Remove the force flag from the API in a new microversion
> -----------------------------------------------------------
> 0)-3): all cases would call the scheduler to verify the target host 
> and generate the nested (or non-nested) allocation.
> We would still need an agreed behavior (from A), B), D)) for the old 
> microversions, as today's code creates inconsistent allocations in 
> #1) and #3) by ignoring the resources from the nested RPs.
> D) Do not manage allocations in placement for forced operation
> --------------------------------------------------------------
> The force flag is considered a last-resort tool for the admin to move 
> VMs around. The API doc has a fat warning about the danger of it. So 
> Nova can simply ignore the resource allocation task if force=True. Nova 
> would delete the source allocation and would not create any allocation 
> on the destination host.
> This is a simple but dangerous solution, but it is what the force flag 
> is all about: moving the server against all the built-in safeties. (If 
> the admin needs the safeties she can set force=False and still 
> specify the destination host.)
> I'm open to any suggestions.
> Cheers,
> gibi
> [0] https://review.openstack.org/#/c/608298/
> [1] https://developer.openstack.org/api-ref/compute/#live-migrate-server-os-migratelive-action
> [2] https://developer.openstack.org/api-ref/compute/#evacuate-server-evacuate-action
> [3] https://github.com/openstack/nova/blob/c5a7002bd571379818c0108296041d12bc171728/nova/conductor/tasks/live_migrate.py#L97
> [4] https://review.openstack.org/#/c/605785
> [5] https://review.openstack.org/#/c/606111
