[openstack-dev] [nova] NUMA-aware live migration: easy but incomplete vs complete but hard

Artom Lifshitz alifshit at redhat.com
Mon Jun 18 14:16:05 UTC 2018


Hey all,

For Rocky I'm trying to get live migration to work properly for
instances that have a NUMA topology [1].

A question that came up on one of the patches [2] is how to handle
resource claims on the destination, or indeed whether to handle them
at all.

The previous attempt's approach [3] (call it A) was to use the
resource tracker. This is race-free and the "correct" way to do it,
but the code is pretty opaque and not easily reviewable, as evidenced
by [3] sitting in review purgatory for literally years.

A simpler approach (call it B) is to ignore resource claims entirely
for now and wait for NUMA in placement to land in order to handle it
that way. This is obviously race-prone and not the "correct" way of
doing it, but the code would be relatively easy to review.
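
To make the difference between A and B concrete, here is a toy sketch.
It is not nova code: ToyHost, fits() and claim() are hypothetical names
made up for illustration, and a real claim would also cover memory,
hugepages and so on. The only point is that a check without a
reservation (B) lets two concurrent migrations both "fit", while an
atomic check-and-reserve (A) lets only one of them through.

```python
import threading


class ToyHost:
    """Hypothetical stand-in for a destination host; not nova's API."""

    def __init__(self, free_cpus):
        self.free_cpus = set(free_cpus)
        self._lock = threading.Lock()

    def fits(self, wanted):
        # Approach B style: look at free CPUs but reserve nothing, so a
        # second migration can pass the same check a moment later.
        return len(self.free_cpus) >= wanted

    def claim(self, wanted):
        # Approach A style: check and reserve under one lock, so the
        # CPUs handed out here cannot be promised to anyone else.
        with self._lock:
            if len(self.free_cpus) < wanted:
                return None
            picked = sorted(self.free_cpus)[:wanted]
            self.free_cpus -= set(picked)
            return picked


host = ToyHost([0, 1])

# B: both incoming migrations are told they fit on the same two CPUs.
b_first, b_second = host.fits(2), host.fits(2)

# A: the first claim reserves the CPUs, the second one is refused.
a_first, a_second = host.claim(2), host.claim(2)
```

With B, the conflict only shows up later, once both instances are
actually on the destination. With A, the second migration is refused up
front, at the cost of the extra claim bookkeeping.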

For the longest time, live migration did not keep track of resources
(until it started updating placement allocations). The message to
operators was essentially "we're giving you this massive hammer, don't
break your fingers." Continuing to ignore resource claims for now is
just maintaining the status quo. In addition, there is value in
improving NUMA live migration *now*, even if the improvement is
incomplete because it's missing resource claims. "Best is the enemy of
good" and all that. Finally, making use of the resource tracker is
just work that we know will get thrown out once we start using
placement for NUMA resources.

For all those reasons, I would favor approach B, but I wanted to ask
the community for their thoughts.

Thanks!

[1] https://review.openstack.org/#/q/topic:bp/numa-aware-live-migration+(status:open+OR+status:merged)
[2] https://review.openstack.org/#/c/567242/
[3] https://review.openstack.org/#/c/244489/


