[openstack-dev] [nova] NUMA-aware live migration: easy but incomplete vs complete but hard

Sahid Orentino Ferdjaoui sferdjao at redhat.com
Thu Jun 21 08:22:04 UTC 2018


On Mon, Jun 18, 2018 at 10:16:05AM -0400, Artom Lifshitz wrote:
> Hey all,
> 
> For Rocky I'm trying to get live migration to work properly for
> instances that have a NUMA topology [1].
> 
> A question that came up on one of patches [2] is how to handle
> resources claims on the destination, or indeed whether to handle that
> at all.
> 
> The previous attempt's approach [3] (call it A) was to use the
> resource tracker. This is race-free and the "correct" way to do it,
> but the code is pretty opaque and not easily reviewable, as evidenced
> by [3] sitting in review purgatory for literally years.
> 
> A simpler approach (call it B) is to ignore resource claims entirely
> for now and wait for NUMA in placement to land in order to handle it
> that way. This is obviously race-prone and not the "correct" way of
> doing it, but the code would be relatively easy to review.

Hello Artom, my problem with approach B is that it is based on
something that was not designed for this, and so it will end up with
the same bugs that you are trying to solve (1417667, 1289064).

Live migration is a sensitive operation that operators need to be
able to trust. Consider the case of a host evacuation: with racy
claims the result could be terrible, no?

If you want to continue with B, I think you will at least have to
find a mechanism to update the destination host's NUMA topology
resources during ongoing migrations. And that should be done early,
to avoid too big a window in which another instance can be scheduled
and assigned the same CPU topology. Also, does this really make sense
when we know that at some point placement will take care of such
things for NUMA resources?
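To make the idea concrete, here is a minimal standalone sketch (all
names are illustrative; none of this is actual Nova code) of what
such an early claim on the destination could look like: reserve the
pCPUs under a lock as soon as the migration starts, instead of only
accounting for them once it finishes:

    import threading

    class HostNUMACell:
        def __init__(self, cell_id, cpus):
            self.cell_id = cell_id
            self.free_cpus = set(cpus)

    class DestinationHost:
        def __init__(self, cells):
            self.cells = cells            # {cell_id: HostNUMACell}
            self._lock = threading.Lock()

        def claim_for_migration(self, wanted):
            # wanted = {cell_id: number of pCPUs needed on that cell}.
            # Reserve atomically: either the whole claim fits or
            # nothing is taken, so a concurrent schedule never sees
            # half a claim.
            with self._lock:
                plan = {}
                for cell_id, count in wanted.items():
                    free = self.cells[cell_id].free_cpus
                    if len(free) < count:
                        return None   # destination cannot fit the guest
                    plan[cell_id] = set(sorted(free)[:count])
                for cell_id, cpus in plan.items():
                    self.cells[cell_id].free_cpus -= cpus
                return plan

    # Claim 2 pCPUs on cell 0 before the memory copy starts, so
    # another instance scheduled meanwhile cannot be pinned to the
    # same CPUs.
    host = DestinationHost({0: HostNUMACell(0, range(8))})
    pinning = host.claim_for_migration({0: 2})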

Approach A already handles what you need (a rough sketch of the claim
pattern follows the list):

- Test whether the destination host can accept the guest CPU policy
- Build a new instance NUMA topology based on the destination host
- Hold and update the destination host's NUMA topology resources
- Store the destination host NUMA topology so it can be used by the source
...
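For illustration only, the claim/rollback pattern that the resource
tracker gives you looks roughly like this (reusing the hypothetical
DestinationHost from the sketch above; the real code lives in Nova's
resource tracker and claims objects):

    from contextlib import contextmanager

    @contextmanager
    def migration_claim(dest_host, wanted):
        # Hold the destination's NUMA resources for the duration of
        # the migration; give them back automatically if it aborts.
        pinning = dest_host.claim_for_migration(wanted)
        if pinning is None:
            raise RuntimeError(
                'destination cannot accept the guest topology')
        try:
            yield pinning
        except Exception:
            for cell_id, cpus in pinning.items():
                dest_host.cells[cell_id].free_cpus |= cpus
            raise

    # with migration_claim(host, {0: 2}) as pinning:
    #     ...  # drive the live migration with the claimed pinning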

My preference is A because it reuses something that is used for every
guest scheduled today (not only for PCI or NUMA things), we trust it,
it is also used for some move operations, it limits the race window
to one we already have, and finally it limits the amount of new code
introduced.


Thanks,
s.

> For the longest time, live migration did not keep track of resources
> (until it started updating placement allocations). The message to
> operators was essentially "we're giving you this massive hammer, don't
> break your fingers." Continuing to ignore resource claims for now is
> just maintaining the status quo. In addition, there is value in
> improving NUMA live migration *now*, even if the improvement is
> incomplete because it's missing resource claims. "Best is the enemy of
> good" and all that. Finally, making use of the resource tracker is
> just work that we know will get thrown out once we start using
> placement for NUMA resources.
> 
> For all those reasons, I would favor approach B, but I wanted to ask
> the community for their thoughts.
> 
> Thanks!
> 
> [1] https://review.openstack.org/#/q/topic:bp/numa-aware-live-migration+(status:open+OR+status:merged)
> [2] https://review.openstack.org/#/c/567242/
> [3] https://review.openstack.org/#/c/244489/
