[openstack-dev] [nova] Supporting force live-migrate and force evacuate with nested allocations
Sylvain Bauza
sylvain.bauza at gmail.com
Wed Oct 10 11:57:32 UTC 2018
On Wed, Oct 10, 2018 at 12:32, Balázs Gibizer <balazs.gibizer at ericsson.com>
wrote:
> Hi,
>
> Thanks for all the feedback. I feel the following consensus is forming:
>
> 1) remove the force flag in a new microversion. I've proposed a spec
> about that API change [1]
>
Thanks, will look at it.
> 2) in the old microversions, change the blind allocation copy to gather
> every resource from the nested source RPs too and try to allocate it all
> from the destination root RP. In nested allocation cases putting this
> allocation to placement will fail and nova will fail the migration /
> evacuation. However it will succeed if the server does not need a nested
> allocation on either the source or the destination host (a.k.a. the
> legacy case), or if the server has a nested allocation on the source host
> but does not need a nested allocation on the destination host (for
> example the dest host does not have a nested RP tree yet).
>
>
Cool with me.
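A rough sketch of the enhanced copy in 2), assuming an allocation is shaped like a mapping of RP UUID to a "resources" dict. The function name and the RP names are made up for illustration; this is not nova's code.

```python
from collections import Counter


def flatten_to_dest_root(source_allocations, dest_root_rp_uuid):
    """Sum every resource of the source allocation, nested RPs included,
    and request the total from the destination root RP only.

    Assumed shape: {rp_uuid: {"resources": {resource_class: amount}}}.
    """
    total = Counter()
    for alloc in source_allocations.values():
        # Counter.update() adds amounts, so the same class on two RPs is summed.
        total.update(alloc["resources"])
    return {dest_root_rp_uuid: {"resources": dict(total)}}


# Hypothetical source allocation: compute root RP plus one nested RP.
source = {
    "src-compute": {"resources": {"VCPU": 2, "MEMORY_MB": 4096}},
    "src-nic": {"resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000}},
}
dest_request = flatten_to_dest_root(source, "dst-compute")
print(dest_request)
```

If the destination root RP has no inventory for one of the summed resource classes, placement is expected to reject this request, which is the failure described in 2).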
> I will start implementing #2) as part of the
> use-nested-allocation-candidate bp soon and will continue with #1)
> later in the cycle.
>
> Nothing is set in stone yet so feedback is still very appreciated.
>
> Cheers,
> gibi
>
> [1] https://review.openstack.org/#/c/609330/
>
> On Tue, Oct 9, 2018 at 11:40 AM, Balázs Gibizer
> <balazs.gibizer at ericsson.com> wrote:
> > Hi,
> >
> > Setup
> > -----
> >
> > nested allocation: an allocation that contains resources from one or
> > more nested RPs. (If you have a better term for this then please
> > suggest it.)
> >
> > If an instance has a nested allocation it means that the compute it
> > allocates from has a nested RP tree. BUT if a compute has a nested
> > RP tree it does not automatically mean that the instance allocating
> > from that compute has a nested allocation (e.g. bandwidth inventory
> > will be on nested RPs but not every instance will require bandwidth).
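The definition above can be sketched as a small check, assuming an allocation is a mapping of RP UUID to a "resources" dict and that the caller knows the UUID of the compute (root) RP. The helper name and RP names are hypothetical, and sharing RPs are deliberately ignored here, as in the rest of this thread.

```python
def is_nested_allocation(allocations, root_rp_uuid):
    """Return True if anything is allocated from an RP other than the root.

    Assumed shape: {rp_uuid: {"resources": {resource_class: amount}}}.
    Every non-root RP is treated as nested; sharing RPs are not modelled.
    """
    return any(rp_uuid != root_rp_uuid for rp_uuid in allocations)


compute = "compute-rp"

# The compute has a nested tree, but this instance does not ask for bandwidth.
flat = {compute: {"resources": {"VCPU": 2, "MEMORY_MB": 4096}}}

# This instance also consumes bandwidth from a nested RP.
with_bandwidth = {
    compute: {"resources": {"VCPU": 2, "MEMORY_MB": 4096}},
    "nic-rp": {"resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000}},
}

print(is_nested_allocation(flat, compute))
print(is_nested_allocation(with_bandwidth, compute))
```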
> >
> > Afaiu, as soon as we have NUMA modelling in place even the most trivial
> > servers will have nested allocations as CPU and MEMORY inventory
> > will be moved to the nested NUMA RPs. But NUMA is still in the future.
> >
> > Sidenote: there is an edge case reported by bauzas when an instance
> > allocates _only_ from nested RPs. This was discussed last Friday
> > and it resulted in a new patch[0] but I would like to keep that
> > discussion separate from this if possible.
> >
> > Sidenote: the current problem is somewhat related not just to nested RPs
> > but to sharing RPs as well. However I'm not aiming to implement
> > sharing support in Nova right now so I will also try to keep the sharing
> > discussion separate if possible.
> >
> > There was already some discussion at Monday's scheduler meeting
> > but I could not attend.
> >
> > http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-10-08-14.00.log.html#l-20
> >
> >
> > The meat
> > --------
> >
> > Both live-migrate[1] and evacuate[2] have an optional force flag on
> > the nova REST API. The documentation says: "Force <the action> by not
> > verifying the provided destination host by the scheduler."
> >
> > Nova implements this statement by not calling the scheduler if
> > force=True BUT it still tries to manage allocations in placement.
> >
> > To have an allocation on the destination host Nova blindly copies the
> > instance allocation from the source host to the destination host
> > during these operations. Nova can do that as 1) the whole allocation
> > is against a single RP (the compute RP) and 2) Nova knows both the
> > source compute RP and the destination compute RP.
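For illustration, the blind copy described above amounts to re-keying the allocation from the source compute RP to the destination compute RP. This is a minimal sketch with a hypothetical function name and the same assumed allocation shape (RP UUID mapped to a "resources" dict), not the actual nova implementation.

```python
import copy


def blind_copy_allocation(source_allocations, source_rp_uuid, dest_rp_uuid):
    """Copy the allocation held against the source compute RP onto the
    destination compute RP, without asking the scheduler or placement a_c.

    Only the entry of the source compute RP is looked at, so anything held
    on other (nested) RPs is silently left out of the result.
    """
    return {dest_rp_uuid: copy.deepcopy(source_allocations[source_rp_uuid])}


source = {"src-compute": {"resources": {"VCPU": 2, "MEMORY_MB": 4096}}}
dest = blind_copy_allocation(source, "src-compute", "dst-compute")
print(dest)
```

The sketch relies on both assumptions listed above: a single RP holds the whole allocation, and both compute RP UUIDs are known.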
> >
> > However as soon as we bring nested allocations into the picture that
> > blind copy will not be feasible. Possible cases:
> > 0) The instance has a non-nested allocation on the source and would
> > need a non-nested allocation on the destination. This works with the
> > blind copy today.
> > 1) The instance has a nested allocation on the source and would need
> > a nested allocation on the destination as well.
> > 2) The instance has a non-nested allocation on the source and would
> > need a nested allocation on the destination.
> > 3) The instance has a nested allocation on the source and would need
> > a non-nested allocation on the destination.
> >
> > Nova cannot generate nested allocations easily without reimplementing
> > some of the placement allocation candidate (a_c) code. However I
> > don't like the idea of duplicating some of the a_c code in Nova.
> >
> > Nova cannot detect what kind of allocation (nested or non-nested) an
> > instance would need on the destination without calling placement a_c.
> > So knowing when to call placement is a chicken and egg problem.
> >
> > Possible solutions:
> > A) fail fast
> > ------------
> > 0) Nova can detect that the source allocation is non-nested and try
> > the blind copy and it will succeed.
> > 1) Nova can detect that the source allocation is nested and fail the
> > operation.
> > 2) Nova only sees a non-nested source allocation. Even if the dest RP
> > tree is nested it does not mean that the allocation will be nested.
> > We cannot fail fast. Nova can try the blind copy and allocate every
> > resource from the root RP of the destination. If the instance
> > requires a nested allocation instead, the claim will fail in placement.
> > So nova can fail the operation a bit later than in 1).
> > 3) Nova can detect that the source allocation is nested and fail the
> > operation. However an enhanced blind copy that tries to allocate
> > everything from the root RP on the destination would have worked.
> >
> > B) Guess when to ignore the force flag and call the scheduler
> > -------------------------------------------------------------
> > 0) keep the blind copy as it works
> > 1) Nova detects that the source allocation is nested. It ignores the
> > force flag and calls the scheduler that will call placement a_c. Move
> > operation can succeed.
> > 2) Nova only sees a non-nested source allocation so it will fall back
> > to blind copy and fail at the claim on the destination.
> > 3) Nova detects that the source allocation is nested. It ignores the
> > force flag and calls the scheduler that will call placement a_c. Move
> > operation can succeed.
> >
> > This solution would be against the API doc that states nova does not
> > call the scheduler if the operation is forced. However in case of
> > force live-migration Nova already verifies the target host from a
> > couple of perspectives in [3].
> > This solution is already proposed for live-migrate in [4] and for
> > evacuate in [5] so the complexity of the solution can be seen in the
> > reviews.
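To compare A) and B) side by side, the per-case behaviour listed above can be tabulated with a toy model. Everything here (function names, the outcome strings) is invented for illustration; the only input either option can actually observe is whether the source allocation is nested.

```python
# case number -> (source allocation is nested, destination needs nested)
CASES = {0: (False, False), 1: (True, True), 2: (False, True), 3: (True, False)}


def option_a(source_nested):
    """A) fail fast: refuse as soon as the source allocation is nested."""
    return "fail" if source_nested else "blind copy"


def option_b(source_nested):
    """B) ignore the force flag and call the scheduler for nested sources."""
    return "call scheduler" if source_nested else "blind copy"


def outcome(action, dest_needs_nested):
    """What happens once the chosen action meets the destination host."""
    if action == "fail":
        return "rejected up front"
    if action == "call scheduler":
        return "can succeed"
    # Blind copy allocates everything from the destination root RP.
    return "fails at claim" if dest_needs_nested else "succeeds"


def table():
    """Return {case: (outcome under A, outcome under B)}."""
    return {
        case: (outcome(option_a(src), dst), outcome(option_b(src), dst))
        for case, (src, dst) in CASES.items()
    }


for case, (a, b) in table().items():
    print(f"case {case}: A={a!r} B={b!r}")
```

The two options only differ in cases 1) and 3), where the source allocation is nested.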
> >
> > C) Remove the force flag from the API in a new microversion
> > -----------------------------------------------------------
> > 0)-3): all cases would call the scheduler to verify the target host
> > and generate the nested (or non-nested) allocation.
> > We would still need an agreed behavior (from A), B), D)) for the old
> > microversions as today's code creates inconsistent allocations in
> > #1) and #3) by ignoring the resources from the nested RPs.
> >
> > D) Do not manage allocations in placement for forced operation
> > --------------------------------------------------------------
> > The force flag is considered a last resort tool for the admin to move
> > VMs around. The API doc has a fat warning about the danger of it. So
> > Nova can simply skip the resource allocation task if force=True. Nova
> > would delete the source allocation and would not create any allocation
> > on the destination host.
> >
> > This is a simple but dangerous solution but it is what the force flag
> > is all about: moving the server against all the built-in safeties. (If
> > the admin needs the safeties she can set force=False and still
> > specify the destination host.)
> >
> > I'm open to any suggestions.
> >
> > Cheers,
> > gibi
> >
> > [0] https://review.openstack.org/#/c/608298/
> > [1] https://developer.openstack.org/api-ref/compute/#live-migrate-server-os-migratelive-action
> > [2] https://developer.openstack.org/api-ref/compute/#evacuate-server-evacuate-action
> > [3] https://github.com/openstack/nova/blob/c5a7002bd571379818c0108296041d12bc171728/nova/conductor/tasks/live_migrate.py#L97
> > [4] https://review.openstack.org/#/c/605785
> > [5] https://review.openstack.org/#/c/606111
> >
>
>