[nova] Do we need to copy pci_devices to target cell DB during cross-cell resize?
While working on the code to create instance-related data in the target cell database during a cross-cell resize, I noticed I wasn't copying over pci_devices [1], but when looking at the PciDevice object create/save methods, those aren't really what I'm looking for here. And looking at the data model, there is a compute_node_id field which is the compute_nodes.id primary key, which won't match the target cell DB.

I am not very familiar with the PCI device manager code and data model (it's my weakest area in nova lo these many years), but looking closer at this, am I correct in understanding that the PciDevice object and data model are really about the actual inventory and allocations of PCI devices on a given compute node, and that therefore it doesn't really make sense to copy that data over to the target cell database?

During a cross-cell resize, the scheduler is going to pick a target host in another cell and claim standard resources (VCPU, MEMORY_MB and DISK_GB) in placement, but things like NUMA/PCI claims won't happen until we do a ResourceTracker.resize_claim on the target host in the target cell. In that case, it seems the only things I need to care about mirroring are instance.pci_requests and instance.numa_topology, correct? Those are the user-requested (via flavor/image/port) resources which will then result in PciDevice and NUMA allocations on the target host.

I'm just looking for confirmation from others that better understand the data model in this area.

[1] https://review.openstack.org/#/c/627892/5/nova/conductor/tasks/cross_cell_mi...

--

Thanks,

Matt
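(A minimal sketch of the distinction being asked about, in illustrative Python; the helper name and structure are hypothetical and not part of the actual conductor change.)

    # Hypothetical sketch, not the real cross-cell conductor task.
    # pci_devices rows are host-side inventory/allocations whose
    # compute_node_id points at the *source* cell's compute_nodes table,
    # so they don't translate to the target cell. pci_requests and
    # numa_topology describe what the user asked for via flavor/image/port.
    def _mirror_request_side_data(source_instance, target_instance):
        # Request-side data travels with the instance and is what the
        # target-cell claim will consume.
        target_instance.pci_requests = source_instance.pci_requests
        target_instance.numa_topology = source_instance.numa_topology
        # Host-side data is intentionally NOT copied; it is recreated by
        # ResourceTracker.resize_claim on the chosen target host.
        return target_instance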
On Tue, 2019-01-22 at 09:20 -0600, Matt Riedemann wrote:
> During a cross-cell resize, the scheduler is going to pick a target host in another cell and claim standard resources (VCPU, MEMORY_MB and DISK_GB) in placement, but things like NUMA/PCI claims won't happen until we do a ResourceTracker.resize_claim on the target host in the target cell. In that case, it seems the only things I need to care about mirroring are instance.pci_requests and instance.numa_topology, correct?

Yes, I believe that is correct.

The scheduler, when selecting the host in the remote cell, will need to run the PCI passthrough filter to validate that pci_request against the destination host. Since you are specifically doing a resize, you don't need to regenerate new XML on the source node before starting the move (it's not a live migration), but you might want to preemptively allocate the PCI devices on the destination if you want to prevent a race with other hosts. That said, for Stein it may be better to declare that out of scope; it's really not any more racy than spawning an instance, as we don't claim the device until we get to the compute node anyway today.

The instance.numa_topology should really be recalculated for the target host as well. You do not want to require the destination host to place the VM with the original NUMA topology from the source node. So I think you need to propagate the NUMA-related requests, which are all in the flavor/image, but I don't think you need to copy the instance numa_topology object. It's not a live migration, so provided the NUMA topology filter says the host is valid, you are free to recalculate the NUMA topology from scratch when it lands on the compute node, based on the image and flavor values.
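(For context, the filter mentioned above only needs request-side data; the following is roughly paraphrased from the upstream PciPassthroughFilter, not verbatim.)

    from nova.scheduler import filters

    class PciPassthroughFilter(filters.BaseHostFilter):
        # Paraphrased sketch of the existing scheduler filter.
        def host_passes(self, host_state, spec_obj):
            pci_requests = spec_obj.pci_requests
            if not pci_requests or not pci_requests.requests:
                return True
            # pci_stats is built from the destination host's own PCI
            # device inventory, so nothing from the source cell DB is
            # needed here -- only instance.pci_requests.
            return host_state.pci_stats.support_requests(
                pci_requests.requests)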
On 1/22/2019 12:19 PM, Sean Mooney wrote:
> Since you are specifically doing a resize, you don't need to regenerate new XML on the source node before starting the move (it's not a live migration), but you might want to preemptively allocate the PCI devices on the destination if you want to prevent a race with other hosts. That said, for Stein it may be better to declare that out of scope; it's really not any more racy than spawning an instance, as we don't claim the device until we get to the compute node anyway today.
Correct, and the plan for cross-cell resize is to do that same RT.resize_claim on the target host in the target cell before trying to move anything, just like we do in ComputeManager.prep_resize for a normal resize within the same cell.
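(Simplified sketch of that claim pattern; the exact resize_claim signature differs across releases, so treat this as illustrative rather than verbatim.)

    def prep_resize_on_target(rt, context, instance, new_flavor, nodename,
                              migration, limits):
        # resize_claim is a context manager: entering it tests and records
        # the PCI/NUMA/memory claim against the target host; an exception
        # inside the block aborts (drops) the claim.
        with rt.resize_claim(context, instance, new_flavor, nodename,
                             migration, limits=limits):
            # ...continue with the resize now that resources are claimed
            #    on the target host, before anything is moved.
            pass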
> The instance.numa_topology should really be recalculated for the target host as well. You do not want to require the destination host to place the VM with the original NUMA topology from the source node. So I think you need to propagate the NUMA-related requests, which are all in the flavor/image, but I don't think you need to copy the instance numa_topology object. It's not a live migration, so provided the NUMA topology filter says the host is valid, you are free to recalculate the NUMA topology from scratch when it lands on the compute node, based on the image and flavor values.
The instance.numa_topology is calculated from the flavor and image during the initial server create:

https://github.com/openstack/nova/blob/dd84e75260c3c919398536f7d05764713dc1c...

And during the MoveClaim:

https://github.com/openstack/nova/blob/dd84e75260c3c919398536f7d05764713dc1c...

So yeah, it looks like I don't really have to worry about updating/setting instance.numa_topology in the target cell DB during the resize, although it seems pretty weird that we leave that stale information in the instances table during a resize (it's also stale in the RequestSpec - I think Alex Xu reported a bug for that).

--

Thanks,

Matt
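(For readers without the links handy, the calculation in question boils down to roughly the following; paraphrased, not the exact upstream code.)

    from nova.virt import hardware

    def build_instance_numa_topology(flavor, image_meta):
        # Returns an InstanceNUMATopology (or None) derived purely from
        # flavor extra specs and image properties -- no data from the
        # source host or source cell is needed, which is why the target
        # cell can recalculate it during the move claim.
        return hardware.numa_get_constraints(flavor, image_meta)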