[openstack-dev] [nova] NUMA-aware live migration: easy but incomplete vs complete but hard

Sylvain Bauza sbauza at redhat.com
Wed Jun 20 16:00:26 UTC 2018


On Tue, Jun 19, 2018 at 9:59 PM, Artom Lifshitz <alifshit at redhat.com> wrote:

> > Adding claims support later on wouldn't change any on-the-wire
> > messaging, it would just make things work more robustly.
>
> I'm not even sure about that. Assuming [1] has at least the right
> idea, it looks like it's an either-or kind of thing: either we use
> resource tracker claims and get the new instance NUMA topology that
> way, or do what was in the spec and have the dest send it to the
> source.
>
> That being said, I think I'm still in favor of choosing the
> "easy" way out. For instance, [2] should fail because we can't access
> the api db from the compute node. So unless there's a simpler way,
> using RT claims would involve changing the RPC to add parameters to
> check_can_live_migrate_destination, which, while not necessarily
> bad, seems like useless complexity for a thing we know will get ripped
> out.
>
When we reviewed the spec, we agreed as a community that we would still
have race conditions once the series is implemented, but that at least it
helps operators.
Quoting the spec:
"It would also be possible for another instance to steal NUMA resources
from a live migrated instance before the latter’s destination compute host
has a chance to claim them. Until NUMA resource providers are implemented
[3] <https://review.openstack.org/#/c/552924/> and allow for an essentially
atomic schedule+claim operation, scheduling and claiming will keep being
done at different times on different nodes. Thus, the potential for races
will continue to exist."
https://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/numa-aware-live-migration.html#proposed-change
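
As a purely illustrative aside, the race described in the quoted spec text
can be sketched with a toy model. This is not Nova code: ToyHost, fits,
claim and simulate_race are hypothetical names invented for this sketch,
and the real resource tracker is far more involved. It only shows why a
check at scheduling time followed by a later claim leaves a window open.

```python
from dataclasses import dataclass


@dataclass
class ToyHost:
    """Hypothetical stand-in for a destination compute host (not a Nova class)."""

    # NUMA cell id -> number of free dedicated CPUs in that cell
    free_pcpus_per_cell: dict

    def fits(self, cpus_needed):
        """Scheduling-time check: does any single cell have enough free CPUs?"""
        return any(free >= cpus_needed
                   for free in self.free_pcpus_per_cell.values())

    def claim(self, cpus_needed):
        """Claim-time step: take CPUs from the first cell that fits.

        Returns the cell id on success, or None if nothing fits any more.
        """
        for cell, free in self.free_pcpus_per_cell.items():
            if free >= cpus_needed:
                self.free_pcpus_per_cell[cell] = free - cpus_needed
                return cell
        return None


def simulate_race():
    dest = ToyHost(free_pcpus_per_cell={0: 4, 1: 0})

    # 1. The live migration is scheduled: the destination looks suitable.
    scheduled_ok = dest.fits(4)

    # 2. Before the migrated instance claims anything, another instance
    #    lands on the same host and takes the CPUs.
    other_cell = dest.claim(4)

    # 3. The migrated instance now tries to claim and finds nothing left.
    migrated_cell = dest.claim(4)

    return scheduled_ok, other_cell, migrated_cell
```

In this toy run the scheduling check passes, the other instance gets cell 0,
and the migrated instance's claim fails. An atomic schedule+claim, as the
spec text says, would close the gap between steps 1 and 3.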

So, my own opinion is that yes, the "easy" way out is better than no way.
From what I understand (and let's be honest, I haven't had time to look at
your code yet), your series doesn't diverge from the proposed implementation,
so I don't see a problem here. If, for some reason, you need to write an
alternative, just tell us why (and ideally write a spec amendment patch so
the spec is consistent with the series).

-Sylvain




> [1] https://review.openstack.org/#/c/576222/
> [2] https://review.openstack.org/#/c/576222/3/nova/compute/manager.py@5897
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>