[nova] live migration with the NUMA topology

Matt Riedemann mriedemos at gmail.com
Thu Dec 12 13:59:12 UTC 2019


On 12/12/2019 7:24 AM, Brin Zhang(张百林) wrote:
> I have a question: if the destination host's NUMA topology (e.g.
> numa_nodes=2) is smaller than the source's NUMA topology (e.g.
> numa_nodes=4) for an instance, what happens when I live migrate *this*
> instance? Is it rolled back, keeping the instance in its original state,
> or does it go to ERROR? In that spec I could not find the details about
> the part I marked in red, "Third, information about the instance's new
> NUMA characteristics needs to be generated on the destination (an
> InstanceNUMATopology object is not enough, more on that later)", or
> maybe I did not read carefully enough :). Anyway, I want to know how the
> NUMA topology is handled during live migration?
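
For concreteness, here is how I read that scenario, as a minimal sketch in
plain Python (illustrative names only, not actual Nova objects or code):

    def fits_on_host(instance_cells, host_cells):
        # Simplification: each instance NUMA cell has to land on a distinct
        # host NUMA node, so more instance cells than host nodes never fits.
        return len(instance_cells) <= len(host_cells)

    instance_cells = [0, 1, 2, 3]   # e.g. flavor extra spec hw:numa_nodes=4
    dest_host_cells = [0, 1]        # destination only exposes 2 NUMA nodes

    assert not fits_on_host(instance_cells, dest_host_cells)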

Artom can answer this in detail but I would expect the claim to fail on 
the dest host here:

https://github.com/openstack/nova/blob/20.0.0/nova/compute/manager.py#L6656
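
My mental model of that dest-side step, as a simplified sketch (the helper
names here are hypothetical; only the exception classes are real):

    from nova import exception

    def try_claim_on_destination(claim_numa_resources, instance, dest_host):
        # claim_numa_resources stands in for the resource tracker's claim
        # on the destination compute.
        try:
            return claim_numa_resources(instance, dest_host)
        except exception.ComputeResourcesUnavailable as e:
            # Surface the failed claim back to conductor as a pre-check
            # error so it can try another host.
            raise exception.MigrationPreCheckError(reason=str(e))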

That claim failure will be handled here in conductor:

https://github.com/openstack/nova/blob/20.0.0/nova/conductor/tasks/live_migrate.py#L502

And trigger a "reschedule" to an alternate host. If we run out of 
alternates then MaxRetriesExceeded would be raised:

https://github.com/openstack/nova/blob/20.0.0/nova/conductor/tasks/live_migrate.py#L555
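
Per my reading of that code, the loop is roughly this (hypothetical helper
names; the exception classes and config option are real):

    from nova import exception

    def find_destination(select_host, check_host, max_retries):
        # max_retries corresponds to CONF.migrate_max_retries.
        attempts = 0
        while True:
            host = select_host()     # ask the scheduler for a candidate
            try:
                check_host(host)     # pre-checks + dest resource claim
                return host
            except exception.MigrationPreCheckError:
                attempts += 1
                if attempts > max_retries:
                    raise exception.MaxRetriesExceeded(
                        reason='Exceeded max scheduling retries')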

That MaxRetriesExceeded is then handled here as NoValidHost:

https://github.com/openstack/nova/blob/20.0.0/nova/conductor/manager.py#L457

The vm_state should be unchanged (stay ACTIVE) but the migration status 
will go to "error".
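
In other words, the end state is roughly this (a sketch of the effect, not
the exact handler code):

    def handle_migration_failure(instance, migration):
        # Only the migration record is marked as failed; the instance is
        # left alone, so vm_state stays ACTIVE and the guest keeps running
        # on the source host.
        migration.status = 'error'
        migration.save()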

Artom has been working on functional tests [1] but I'm not sure if they 
cover this kind of scenario - I'd hope they would.

Of course the simpler answer might be (and it would be cool if it is) that
the scheduler should not select a dest host that can't fit the instance in
the first place, so we never even get to the low-level compute resource
claim.
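
Conceptually that scheduler-side check (the NUMATopologyFilter) boils down
to the same fit test done before a host is ever picked; this is only a
sketch of the idea, not the filter's actual code:

    def host_passes(host_numa_topology, instance_numa_topology):
        # No NUMA request from the instance: nothing to check.
        if instance_numa_topology is None:
            return True
        # Instance wants a NUMA topology but the host doesn't report one.
        if host_numa_topology is None:
            return False
        # The host needs at least as many NUMA nodes as the instance has
        # cells (the real filter also checks CPUs, memory, page sizes,
        # pinning, and so on).
        return (len(instance_numa_topology.cells) <=
                len(host_numa_topology.cells))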

[1] https://review.opendev.org/#/c/672595/

-- 

Thanks,

Matt


