[nova][CI] GPUs in the gate
Hey all,

Following up on the CI session during the PTG [1], I wanted to get the ball rolling on getting GPU hardware into the gate somehow. Initially the plan was to do it through OpenLab and by convincing NVIDIA to donate the cards, but after a conversation with Sean McGinnis it appears Infra have access to machines with GPUs.

From Nova's POV, the requirements are:
* The machines with GPUs should probably be Ironic baremetal nodes and not VMs [*].
* The GPUs need to support virtualization. It's hard to get a comprehensive list of GPUs that do, but Nova's own docs [2] mention two: Intel cards with GVT [3] and NVIDIA GRID [4].

So I think at this point the question is whether Infra can support those reqs. If yes, we can start concrete steps towards getting those machines used by a CI job. If not, we'll fall back to OpenLab and try to get them hardware.

[*] Could we do double-passthrough? Could the card be passed through to the L1 guest via the PCI passthrough mechanism, and then into the L2 guest via the mdev mechanism?

[1] https://etherpad.openstack.org/p/nova-ptg-train-ci
[2] https://docs.openstack.org/nova/rocky/admin/virtual-gpu.html
[3] https://01.org/igvt-g
[4] https://docs.nvidia.com/grid/5.0/pdf/grid-vgpu-user-guide.pdf
On Tue, May 7, 2019, at 10:48 AM, Artom Lifshitz wrote:
Hey all,
Following up on the CI session during the PTG [1], I wanted to get the ball rolling on getting GPU hardware into the gate somehow. Initially the plan was to do it through OpenLab and by convincing NVIDIA to donate the cards, but after a conversation with Sean McGinnis it appears Infra have access to machines with GPUs.
From Nova's POV, the requirements are: * The machines with GPUs should probably be Ironic baremetal nodes and not VMs [*]. * The GPUs need to support virtualization. It's hard to get a comprehensive list of GPUs that do, but Nova's own docs [2] mention two: Intel cards with GVT [3] and NVIDIA GRID [4].
So I think at this point the question is whether Infra can support those reqs. If yes, we can start concrete steps towards getting those machines used by a CI job. If not, we'll fall back to OpenLab and try to get them hardware.
What we currently have access to is a small amount of Vexxhost's GPU instances (so mnaser can further clarify my comments here). I believe these are VMs with dedicated nvidia gpus that are passed through. I don't think they support the vgpu feature.

It might help to describe the use case you are trying to meet rather than jumping ahead to requirements/solutions. That way maybe we can work with Vexxhost to better support what you need (or come up with some other solutions). For those of us that don't know all of the particulars it really does help if you can go from use case to requirements.
[*] Could we do double-passthrough? Could the card be passed through to the L1 guest via the PCI passthrough mechanism, and then into the L2 guest via the mdev mechanism?
[1] https://etherpad.openstack.org/p/nova-ptg-train-ci [2] https://docs.openstack.org/nova/rocky/admin/virtual-gpu.html [3] https://01.org/igvt-g [4] https://docs.nvidia.com/grid/5.0/pdf/grid-vgpu-user-guide.pdf
On Tue, May 7, 2019 at 8:00 PM Clark Boylan <cboylan@sapwetik.org> wrote:
On Tue, May 7, 2019, at 10:48 AM, Artom Lifshitz wrote:
Hey all,
Following up on the CI session during the PTG [1], I wanted to get the ball rolling on getting GPU hardware into the gate somehow. Initially the plan was to do it through OpenLab and by convincing NVIDIA to donate the cards, but after a conversation with Sean McGinnis it appears Infra have access to machines with GPUs.
From Nova's POV, the requirements are: * The machines with GPUs should probably be Ironic baremetal nodes and not VMs [*]. * The GPUs need to support virtualization. It's hard to get a comprehensive list of GPUs that do, but Nova's own docs [2] mention two: Intel cards with GVT [3] and NVIDIA GRID [4].
So I think at this point the question is whether Infra can support those reqs. If yes, we can start concrete steps towards getting those machines used by a CI job. If not, we'll fall back to OpenLab and try to get them hardware.
What we currently have access to is a small amount of Vexxhost's GPU instances (so mnaser can further clarify my comments here). I believe these are VMs with dedicated nvidia gpus that are passed through. I don't think they support the vgpu feature.
It might help to describe the use case you are trying to meet rather than jumping ahead to requirements/solutions. That way maybe we can work with Vexxhost to better support what you need (or come up with some other solutions). For those of us that don't know all of the particulars it really does help if you can go from use case to requirements.
Right, apologies, I got ahead of myself. The use case is CI coverage for Nova's VGPU feature. This feature can be summarized (and oversimplified) as "SRIOV for GPUs": a single physical GPU can be split into multiple virtual GPUs (via libvirt's mdev support [5]), each one being assigned to a different guest. We have functional tests in-tree, but no tests with real hardware. So we're looking for a way to get real hardware in the gate. I hope that clarifies things. Let me know if there are further questions. [5] https://libvirt.org/drvnodedev.html#MDEVCap
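For anyone who hasn't poked at the mdev plumbing: the kernel exposes the mediated device types a card supports under sysfs, and that's the layer libvirt and Nova build on. Below is a minimal sketch of what that discovery step looks like; the PCI address is made up, and the mdev_supported_types directories only exist once a vGPU-capable card and its vendor driver (GRID, GVT-g) are actually present.

# Minimal sketch (hypothetical PCI address): list the mdev types a GPU
# advertises and how many instances of each can still be created.
import os

PCI_ADDR = "0000:84:00.0"  # made-up address of a vGPU-capable card
TYPES_DIR = f"/sys/class/mdev_bus/{PCI_ADDR}/mdev_supported_types"

def list_mdev_types(types_dir=TYPES_DIR):
    """Return {mdev_type: available_instances} for each advertised type."""
    types = {}
    for entry in os.listdir(types_dir):
        with open(os.path.join(types_dir, entry, "available_instances")) as f:
            types[entry] = int(f.read().strip())
    return types

for mdev_type, avail in list_mdev_types().items():
    print(f"{mdev_type}: {avail} instances available")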
[*] Could we do double-passthrough? Could the card be passed through to the L1 guest via the PCI passthrough mechanism, and then into the L2 guest via the mdev mechanism?
[1] https://etherpad.openstack.org/p/nova-ptg-train-ci [2] https://docs.openstack.org/nova/rocky/admin/virtual-gpu.html [3] https://01.org/igvt-g [4] https://docs.nvidia.com/grid/5.0/pdf/grid-vgpu-user-guide.pdf
On 2019-05-08 08:46:56 -0400 (-0400), Artom Lifshitz wrote: [...]
The use case is CI coverage for Nova's VGPU feature. This feature can be summarized (and oversimplified) as "SRIOV for GPUs": a single physical GPU can be split into multiple virtual GPUs (via libvirt's mdev support [5]), each one being assigned to a different guest. We have functional tests in-tree, but no tests with real hardware. So we're looking for a way to get real hardware in the gate. [...]
Long shot, but since you just need the feature provided and not the performance it usually implies, are there maybe any open source emulators which provide the same instruction set for conformance testing purposes? -- Jeremy Stanley
On Wed, May 8, 2019 at 9:30 AM Jeremy Stanley <fungi@yuggoth.org> wrote:
Long shot, but since you just need the feature provided and not the performance it usually implies, are there maybe any open source emulators which provide the same instruction set for conformance testing purposes?
Something like that exists for network cards. It's called netdevsim [1], and it's been mentioned in the SRIOV live migration spec [2]. However to my knowledge nothing like that exists for GPUs. [1] https://www.phoronix.com/scan.php?page=news_item&px=Linux-4.16-Networking [2] https://specs.openstack.org/openstack/nova-specs/specs/train/approved/libvir...
On Wed, May 8, 2019 at 8:27 PM, Artom Lifshitz <alifshit@redhat.com> wrote:
On Wed, May 8, 2019 at 9:30 AM Jeremy Stanley <fungi@yuggoth.org> wrote:
Long shot, but since you just need the feature provided and not the performance it usually implies, are there maybe any open source emulators which provide the same instruction set for conformance testing purposes?
Something like that exists for network cards. It's called netdevsim [1], and it's been mentioned in the SRIOV live migration spec [2]. However to my knowledge nothing like that exists for GPUs.
libvirt provides us a way to fake mediated devices attached to instances, but we still need to look up sysfs either for knowing all the physical GPUs or for creating a new mdev, so that's where it's not possible to have an emulator AFAICU. -Sylvain

[1] https://www.phoronix.com/scan.php?page=news_item&px=Linux-4.16-Networking
[2] https://specs.openstack.org/openstack/nova-specs/specs/train/approved/libvir...
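To illustrate the sysfs dependency Sylvain is describing: actually creating an mdev means writing a fresh UUID into a "create" node that only exists under a real physical GPU once the vendor driver is loaded, which is exactly the part an emulator can't fake. A rough sketch, with a hypothetical PCI address and type name:

# Hedged sketch: the PCI address and the "nvidia-35" type name are invented;
# without real hardware and the GRID driver these sysfs paths don't exist.
import uuid

PCI_ADDR = "0000:84:00.0"
MDEV_TYPE = "nvidia-35"

def create_mdev(pci_addr=PCI_ADDR, mdev_type=MDEV_TYPE):
    """Create a mediated device on the given card and return its UUID."""
    mdev_uuid = str(uuid.uuid4())
    create_node = (f"/sys/class/mdev_bus/{pci_addr}/"
                   f"mdev_supported_types/{mdev_type}/create")
    with open(create_node, "w") as f:
        f.write(mdev_uuid)
    # The new device then appears under /sys/bus/mdev/devices/<uuid> and can
    # be attached to a guest via libvirt's <hostdev type='mdev'> element.
    return mdev_uuid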
On Wed, 2019-05-08 at 13:27 +0000, Jeremy Stanley wrote:
On 2019-05-08 08:46:56 -0400 (-0400), Artom Lifshitz wrote: [...]
The use case is CI coverage for Nova's VGPU feature. This feature can be summarized (and oversimplified) as "SRIOV for GPUs": a single physical GPU can be split into multiple virtual GPUs (via libvirt's mdev support [5]), each one being assigned to a different guest. We have functional tests in-tree, but no tests with real hardware. So we're looking for a way to get real hardware in the gate.
[...]
Long shot, but since you just need the feature provided and not the performance it usually implies, are there maybe any open source emulators which provide the same instruction set for conformance testing purposes?

I tried going down this route, looking at the netdevsim module for emulating NICs to test generic SR-IOV, but it does not actually do the PCIe emulation of the VFs.
For vGPUs I'm not aware of any kernel or userspace emulation we could use to test the end-to-end workflow with libvirt. If anyone else knows of one, that would be an interesting alternative to pursue. Also, if any kernel developers want to add PCIe VF emulation to the netdevsim module, it really would be awesome to be able to use that to test SR-IOV NICs in the gate without hardware.
On Tue, 2019-05-07 at 19:56 -0400, Clark Boylan wrote:
On Tue, May 7, 2019, at 10:48 AM, Artom Lifshitz wrote:
Hey all,
Following up on the CI session during the PTG [1], I wanted to get the ball rolling on getting GPU hardware into the gate somehow. Initially the plan was to do it through OpenLab and by convincing NVIDIA to donate the cards, but after a conversation with Sean McGinnis it appears Infra have access to machines with GPUs.
From Nova's POV, the requirements are: * The machines with GPUs should probably be Ironic baremetal nodes and not VMs [*]. * The GPUs need to support virtualization. It's hard to get a comprehensive list of GPUs that do, but Nova's own docs [2] mention two: Intel cards with GVT [3] and NVIDIA GRID [4].
So I think at this point the question is whether Infra can support those reqs. If yes, we can start concrete steps towards getting those machines used by a CI job. If not, we'll fall back to OpenLab and try to get them hardware.
What we currently have access to is a small amount of Vexxhost's GPU instances (so mnaser can further clarify my comments here). I believe these are VMs with dedicated nvidia gpus that are passed through. I don't think they support the vgpu feature.
This is correct. I asked mnaser about this in the past, which is why he made the GPU nodeset available initially, but after checking with Sylvain and confirming the GPU model available via Vexxhost, we determined they could not be used to test vGPU support.
It might help to describe the use case you are trying to meet rather than jumping ahead to requirements/solutions. That way maybe we can work with Vexxhost to better support what you need (or come up with some other solutions). For those of us that don't know all of the particulars it really does help if you can go from use case to requirements.
Effectively we just want to test the mdev-based vGPU support in the libvirt driver. NVIDIA locks down vGPU support to their Tesla and Quadro cards and requires a license server to be running to enable the use of the GRID driver. As a result, to be able to test this feature in the upstream gate we would need a GPU that is on the supported list of the NVIDIA GRID driver and a license server (could just use the trial licenses) so that we can use the vGPU feature. As VFIO mediated devices are an extension of the SR-IOV framework built on top of the VFIO stack, the only simple way to test this would be via a baremetal host, as we do not have a way to do a double passthrough in a way that preserves SR-IOV functionality. (The way I described in my last email is just a theory, and OpenStack is missing vIOMMU support in any case, even if it did work.)
[*] Could we do double-passthrough? Could the card be passed through to the L1 guest via the PCI passthrough mechanism, and then into the L2 guest via the mdev mechanism?
[1] https://etherpad.openstack.org/p/nova-ptg-train-ci [2] https://docs.openstack.org/nova/rocky/admin/virtual-gpu.html [3] https://01.org/igvt-g [4] https://docs.nvidia.com/grid/5.0/pdf/grid-vgpu-user-guide.pdf
On Tue, 2019-05-07 at 13:47 -0400, Artom Lifshitz wrote:
Hey all,
Following up on the CI session during the PTG [1], I wanted to get the ball rolling on getting GPU hardware into the gate somehow. Initially the plan was to do it through OpenLab and by convincing NVIDIA to donate the cards, but after a conversation with Sean McGinnis it appears Infra have access to machines with GPUs.
From Nova's POV, the requirements are: * The machines with GPUs should probably be Ironic baremetal nodes and not VMs [*]. * The GPUs need to support virtualization. It's hard to get a comprehensive list of GPUs that do, but Nova's own docs [2] mention two: Intel cards with GVT [3] and NVIDIA GRID [4].
"Intel cards" is a bit of a misnomer: GVT is currently only supported by the integrated GPU on Intel CPUs, which was removed from Xeon CPUs when GVT support was added. In the future, with the discrete GPUs from Intel slated for release sometime in 2020, we should hopefully have Intel cards that actually support GVT, assuming that is on their GPU product roadmap, but I can see how it would not be, given they developed the tech for their integrated GPUs. It would also be interesting to test AMD GPUs using their SR-IOV approach, but I think NVIDIA Tesla GPUs would be the shortest path forward.
So I think at this point the question is whether Infra can support those reqs. If yes, we can start concrete steps towards getting those machines used by a CI job. If not, we'll fall back to OpenLab and try to get them hardware.
[*] Could we do double-passthrough? Could the card be passed through to the L1 guest via the PCI passthrough mechanism, and then into the L2 guest via the mdev mechanism?
I have a theory about how this "might" be possible, but OpenStack is missing the features required to pull it off. I may test it locally with libvirt, but the only way I think this could work would be to do a full passthrough of the PF to an L1 guest using the q35 chipset with a vIOMMU (not supported in Nova) and hypervisor hiding enabled, and then run the GRID driver in the L1 guest to expose an mdev to the L2 guest. Ironic would be much simpler.
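For the curious, here is roughly what the L1 guest definition in that theory might need in libvirt terms. This is purely a sketch of the idea above, not something Nova can generate today (the vIOMMU piece in particular is unsupported); the PF address is invented and most of the domain XML (disks, networking, etc.) is omitted.

# Sketch only: defines (but does not start) an L1 guest with a q35 machine
# type, hypervisor hiding so the GRID driver will load, an emulated vIOMMU,
# and full PCI passthrough of a made-up GPU PF address.
import libvirt

L1_DOMAIN_XML = """
<domain type='kvm'>
  <name>vgpu-l1</name>
  <memory unit='GiB'>16</memory>
  <vcpu>8</vcpu>
  <os>
    <type arch='x86_64' machine='q35'>hvm</type>
  </os>
  <features>
    <kvm><hidden state='on'/></kvm>
  </features>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x84' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
    <iommu model='intel'/>
  </devices>
</domain>
"""

conn = libvirt.open("qemu:///system")
conn.defineXML(L1_DOMAIN_XML)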
[1] https://etherpad.openstack.org/p/nova-ptg-train-ci [2] https://docs.openstack.org/nova/rocky/admin/virtual-gpu.html [3] https://01.org/igvt-g [4] https://docs.nvidia.com/grid/5.0/pdf/grid-vgpu-user-guide.pdf
participants (5): Artom Lifshitz, Clark Boylan, Jeremy Stanley, Sean Mooney, Sylvain Bauza