[nova] Review guide for PCI tracking for Placement patches
Hi Nova,

The first batch of patches is up for review for the PCI tracking for Placement feature. These mostly cover two aspects of the spec[1]:
1) renaming [pci]passthrough_whitelist to [pci]device_spec
2) pci inventory reporting to placement, excluding existing PCI allocation healing in placement

This covers the first 4 sub chapters of the Proposed Change chapter of the spec[1] up until "PCI alias configuration". I noted intentional deviations from the spec in the spec review [2] and I will push a follow up to the spec at some point fixing those.

I tried to do it in small steps, hence the long list of commits[3]:

#2) pci inventory reporting to placement, excluding existing PCI allocation healing in placement
5827d56310 Stop if tracking is disable after it was enabled before
a4b5788858 Support [pci]device_spec reconfiguration
10642c787a Reject devname based device_spec config
b0ad05fb69 Ignore PCI devs with physical_network tag
f5a34ee441 Reject mixed VF rc and trait config
c60b26014f Reject PCI dependent device config
5cf7325221 Extend device_spec with resource_class and traits
eff0df6a98 Basics for PCI Placement reporting

#1) renaming [pci]passthrough_whitelist to [pci]device_spec
adfe34080a Rename whitelist in tests
ea955a0c15 Rename exception.PciConfigInvalidWhitelist to PciConfigInvalidSpec
55770e4c14 Rename [pci]passthrough_whitelist to device_spec

There is a side track branching out from "adfe34080a Rename whitelist in tests" to clean up the device spec handling[4]:

514500b5a4 Move __str__ to the PciAddressSpec base class
3a6198c8fb Fix type annotation of pci.Whitelist class
f70adbb613 Remove unused PF checking from get_function_by_ifname
b7eef53b1d Clean up mapping input to address spec types
93bbd67101 Poison /sys access via various calls in test
467ef91a86 Remove dead code from PhysicalPciAddress
233212d30f Fix PciAddressSpec descendants to call super.__init__
ad5bd46f46 Unparent PciDeviceSpec from PciAddressSpec
cef0d2de4c Extra tests for remote managed dev spec
2fa2825afb Add more test coverage for devname base dev spec
adfe34080a Rename whitelist in tests

The side track is not a mandatory part of the feature, but I think these patches improve the code at hand and even fix some small bugs.

I will continue with adding allocation healing for existing PCI allocations.

Any feedback is highly appreciated.

Cheers,
gibi

[1] https://specs.openstack.org/openstack/nova-specs/specs/zed/approved/pci-devi...
[2] https://review.opendev.org/c/openstack/nova-specs/+/791047
[3] https://review.opendev.org/q/topic:bp/pci-device-tracking-in-placement
[4] https://review.opendev.org/q/topic:bp/pci-device-spec-cleanup
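To make the rename and the new device_spec tags listed above a bit more concrete, here is a minimal sketch, not the code from the patches. It assumes that a device spec is written as a JSON object in the [pci] section of nova.conf, that a config still using the old passthrough_whitelist name should keep working, and that the new tags are spelled resource_class and traits as in the commit titles. The helper load_device_specs, the use of the standard library configparser instead of Nova's own config handling, and all sample values (vendor and product IDs, CUSTOM_NIC, CUSTOM_FAST) are hypothetical.

```python
import configparser
import json

# Hypothetical config using the new option name and the new tags.
NEW_STYLE_CONF = """
[pci]
device_spec = {"vendor_id": "8086", "product_id": "1563", "resource_class": "CUSTOM_NIC", "traits": "CUSTOM_FAST"}
"""

# Hypothetical config that still uses the old option name.
OLD_STYLE_CONF = """
[pci]
passthrough_whitelist = {"vendor_id": "8086", "product_id": "1563"}
"""


def load_device_specs(conf_text):
    """Return the device spec dicts found in the [pci] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # Prefer the new name; fall back to the old one if only that is set.
    raw = parser.get("pci", "device_spec", fallback=None)
    if raw is None:
        raw = parser.get("pci", "passthrough_whitelist", fallback="[]")
    parsed = json.loads(raw)
    # Accept either a single JSON object or a JSON list of objects.
    return parsed if isinstance(parsed, list) else [parsed]


new_specs = load_device_specs(NEW_STYLE_CONF)
old_specs = load_device_specs(OLD_STYLE_CONF)
print(new_specs[0]["resource_class"], new_specs[0]["traits"])
print(old_specs[0].get("resource_class"))  # no tag set in the old-style spec
```

The point of the sketch is only that both option names can feed the same list of specs, and that resource_class and traits are extra keys next to the existing matching fields.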
On Tue, Jun 21 2022 at 02:04:01 PM +02:00:00, Balazs Gibizer <gibi@redhat.com> wrote:
Pinging this thread as I would like to ask for at least a high level review round to see that the implementation direction is OK before I produce the next bunch of commits of the series.
Cheers, gibi
Hi,

Top posting as I wanted to give an update on the implementation progress of the feature.

As before, I have one main patch series and one additional side track providing improvements to PCI DeviceSpec handling that are not mandatory for the feature itself.

The main series starts at [5] but it now has a bug fix dependency [6] pulled before it. The main series is now in a mergeable state as it contains the complete PCI inventory handling for the feature, and this logic can be enabled independently from the yet to be written scheduling part.

1833394042 Allow enabling PCI tracking in Placement <-- inventory reporting can be enabled by nova.conf
1eabfde2a1 Handle PCI dev reconf with allocations <-- allocation healing works
74dc70ad04 Heal PCI allocation during resize
e1af40959a Heal missing simple PCI allocation in the resource tracker
a520649516 Retry /reshape at provider generation conflict
f7e1ed838f Move provider_tree RP creation to PciResourceProvider <-- inventory reporting works
2d08e28eb3 Stop if tracking is disable after it was enabled before
742bc26da0 Support [pci]device_spec reconfiguration
e445964d59 Reject devname based device_spec config
b796b56622 Ignore PCI devs with physical_network tag
81ba9cf1bf Reject mixed VF rc and trait config
734fa580c3 Reject PCI dependent device config
fd725ce577 Extend device_spec with resource_class and traits
5f4128b188 Basics for PCI Placement reporting
a588df760f Rename whitelist in tests <-- this is where the side track branches out
646e1e69be Rename exception.PciConfigInvalidWhitelist to PciConfigInvalidSpec
d26ff3b695 Rename [pci]passthrough_whitelist to device_spec
d275c20bca Add compute restart capability for libvirt func tests
5b3e6c1146 Poison /sys access via various calls in test <-- main series starts
575c15df7a Update RequestSpec.pci_request for resize <-- bugfix for 1983753
7b0a1e2b30 Reproducer for bug 1983753

The side track starts at [7] in the middle of the main series.

983dfe69d6 Move __str__ to the PciAddressSpec base class
bc24686626 Fix type annotation of pci.Whitelist class
6941757d06 Remove unused PF checking from get_function_by_ifname
6c7903c11c Clean up mapping input to address spec types
30d7c1eadf Remove dead code from PhysicalPciAddress
af649c184b Fix PciAddressSpec descendants to call super.__init__
238a6174e8 Unparent PciDeviceSpec from PciAddressSpec
6836dba493 Extra tests for remote managed dev spec
5d85ec7829 Add more test coverage for devname base dev spec
a588df760f Rename whitelist in tests <-- this is the common base with the main series

Next I will continue with the last part of the feature, the scheduling side.

Any feedback is highly appreciated.

Cheers,
gibi

[1] https://specs.openstack.org/openstack/nova-specs/specs/zed/approved/pci-devi...
[2] https://review.opendev.org/c/openstack/nova-specs/+/791047
[3] https://review.opendev.org/q/topic:bp/pci-device-tracking-in-placement
[4] https://review.opendev.org/q/topic:bp/pci-device-spec-cleanup
[5] https://review.opendev.org/c/openstack/nova/+/844627/
[6] https://review.opendev.org/c/openstack/nova/+/852296
[7] https://review.opendev.org/c/openstack/nova/+/844625
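For readers unfamiliar with the idea behind "Retry /reshape at provider generation conflict" in the list above, the general pattern is an optimistic concurrency retry: rebuild the request from a fresh view of the providers and send it again when the service reports a stale generation. The following is only a sketch of that pattern under stated assumptions, not the code from the patch. GenerationConflict, reshape_with_retry, FakePlacement and the payload layout are all hypothetical names made up for illustration.

```python
class GenerationConflict(Exception):
    """Hypothetical error: the service saw a stale provider generation."""


def reshape_with_retry(build_payload, send_reshape, max_attempts=4):
    """Call send_reshape(build_payload()), retrying on generation conflicts."""
    for attempt in range(1, max_attempts + 1):
        # Rebuild the payload every time so it carries a fresh generation.
        payload = build_payload()
        try:
            return send_reshape(payload)
        except GenerationConflict:
            if attempt == max_attempts:
                raise  # give up and let the caller handle it


class FakePlacement:
    """In-memory stand-in with one simulated concurrent update."""

    def __init__(self):
        self.generation = 1
        self._interfere_once = True

    def reshape(self, payload):
        if self._interfere_once:
            # Another writer bumps the generation before our request lands.
            self._interfere_once = False
            self.generation += 1
        if payload["generation"] != self.generation:
            raise GenerationConflict("stale provider generation")
        self.generation += 1
        return self.generation


placement = FakePlacement()
calls = []


def build_payload():
    calls.append(1)
    return {"generation": placement.generation,
            "inventories": {"CUSTOM_PCI_DEV": 2}}


new_generation = reshape_with_retry(build_payload, placement.reshape)
print(len(calls), new_generation)
```

In this toy run the first attempt hits the simulated conflict and the second one succeeds, because the payload is rebuilt from the current generation before each attempt.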