[nova] mdev management for vgpu
Hello,

I am trying a few things with vGPU on Nova and I am struggling with the mdev management Nova does.

My environment:
- Nvidia A2 pGPU
- OpenStack Caracal

What I did:
- installed the Nvidia GRID host driver on the compute node
- enabled the virtual functions of the pGPU
- added the wanted mdev type and the PCI addresses of the virtual functions to nova.conf
- restarted Nova
- created a flavor with the property resources:VGPU=1

(Basically, I followed this documentation: https://docs.openstack.org/nova/latest/admin/virtual-gpu.html)

The only difference is that I did not put the pGPU PCI address in nova.conf but the addresses of its virtual functions, so in my case 0000:41:00.4 and not 0000:41:00.0 (the resource providers (RPs) were not detected otherwise).

From my understanding of this spec (https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/add-support-for-vgpu.html) and some experimentation, when a vGPU is requested by the end user:
- Nova looks at the mdevs already running and uses one if available.
- If there is no mdev available, Nova looks at the RP tree to find mdev-capable PCI devices that have the specific mdev type available.
- If there is such a PCI device available, Nova creates the mdev with a UUID and creates the domain XML with this UUID inside.
- If no such device is available, Nova returns a "no host available" error.

My questions are:
- Mdevs are not persistent across reboots. When the host reboots, Nova crashes at startup because the mdevs are missing, and just recreating the mdevs does not fix the issue because they need to have the same UUIDs. One could fix this by making the mdevs persistent with a tool like mdevctl, but since Nova creates the mdev the first time, it seems to me that Nova tries to manage mdevs. Is it normal for it to require manual intervention afterward? Or am I missing something?
- In the Nova configuration, is it normal to diverge from the documentation and enter the VF PCI addresses to get the RPs detected?

Thank you in advance for your help.
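The request flow described in the message above can be modelled in a few lines of Python. This is only a sketch of the poster's understanding, not Nova's actual code: the names `MdevCapableDevice`, `ComputeHost`, `NoValidHost` and `claim_mdev` are hypothetical, and so is the mdev type `nvidia-demo` used in the usage note.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class MdevCapableDevice:
    """Hypothetical stand-in for one mdev-capable PCI device (for example a VF)."""
    pci_address: str
    mdev_type: str
    available_instances: int


@dataclass
class ComputeHost:
    """Hypothetical view of one compute host; this is not a Nova class."""
    devices: list
    # Existing mdevs that are not attached to any guest: UUID -> mdev type.
    free_mdevs: dict = field(default_factory=dict)


class NoValidHost(Exception):
    """Raised when an mdev can neither be reused nor created."""


def claim_mdev(host, wanted_type):
    """Return the UUID of an mdev of wanted_type, reusing one if possible."""
    # Step 1: reuse an existing, unattached mdev of the right type.
    for mdev_uuid, mdev_type in list(host.free_mdevs.items()):
        if mdev_type == wanted_type:
            del host.free_mdevs[mdev_uuid]
            return mdev_uuid
    # Steps 2 and 3: otherwise find a capable device with capacity left and
    # "create" a new mdev on it (here: only the bookkeeping, no sysfs write).
    for device in host.devices:
        if device.mdev_type == wanted_type and device.available_instances > 0:
            device.available_instances -= 1
            return str(uuid.uuid4())
    # Step 4: nothing to reuse and nothing to create on.
    raise NoValidHost(f"no device can provide mdev type {wanted_type}")
```

With one free mdev and one VF that has a single instance left, the first call reuses the free mdev, the second creates a new UUID, and a third raises `NoValidHost`.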
On 11/8/24 02:04, Mickael Razzouk wrote:
> - Mdev are not persistent across reboot, but when it happens, nova crash at boot because the mdevs are missing, just recreating mdev does not fix the issue as they need to have the same UUID.
> One could fix the issue by making mdev persistent using a tool like mdevctl, but by creating the mdev the first time I seem to me that Nova tries to manage mdev, is it normal for it to require manual intervention afterward ?
> Or am I missing something ?
I can't answer both of your questions, but I can say that for 2024.1 (Caracal), no, you are not missing anything: mdevs are not persisted across reboots. See https://bugs.launchpad.net/nova/+bug/1900800 for details.

However, in 2024.2 (Dalmatian) we added support for persistent mdevs (note that this requires libvirt >= 7.3.0):

https://docs.openstack.org/releasenotes/nova/2024.2.html

HTH,
-melwitt
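To check a compute host against the libvirt >= 7.3.0 requirement mentioned above, something like the following sketch could be used. It assumes the version is available as the single integer that libvirt reports (major * 1,000,000 + minor * 1,000 + release, as returned by `getLibVersion()` in libvirt-python); the function names are hypothetical and nothing here is taken from Nova.

```python
# Minimum libvirt version quoted in the thread for persistent mdevs.
MIN_LIBVIRT_FOR_PERSISTENT_MDEVS = (7, 3, 0)


def decode_libvirt_version(encoded):
    """Split libvirt's integer version into (major, minor, release)."""
    major, rest = divmod(encoded, 1_000_000)
    minor, release = divmod(rest, 1_000)
    return (major, minor, release)


def meets_persistent_mdev_minimum(encoded):
    """True if the encoded libvirt version is at least 7.3.0."""
    return decode_libvirt_version(encoded) >= MIN_LIBVIRT_FOR_PERSISTENT_MDEVS
```

Comparing tuples rather than strings avoids the usual trap where "10.0.0" sorts before "7.3.0".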
Seconding Melanie on this. Moving to 2024.2 will persist them; otherwise you need some fun startup scripts to recreate the mdevs. Typically along the lines of:

1. Ensure the needed modprobes have run
2. sriov-manage -e ALL
3. Run some fancy script that evaluates the libvirt contents, then figures out (from the placement service and other APIs) which Nvidia profile each mdev needs, and recreates it:
   * echo '<mdev-uuid-reservation>' > /sys/bus/pci/devices/<pci-dev>/mdev_supported_types/nvidia-<profile>/create

Best of luck. vGPU is fun.

Karl Kloppenborg
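Step 3 above could look roughly like the sketch below: read the mdev UUIDs that a libvirt domain XML references, then write each UUID to the matching sysfs `create` file. This is not the script from the thread. The function names are hypothetical, and the PCI address and mdev type for each UUID are assumed to be supplied by the operator (for example after looking them up in placement), since the domain XML does not record them.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def mdev_uuids_from_domain_xml(domain_xml):
    """Return the UUIDs of all mdev hostdevs referenced by a libvirt domain."""
    root = ET.fromstring(domain_xml)
    addresses = root.findall("./devices/hostdev[@type='mdev']/source/address")
    return [addr.get("uuid") for addr in addresses if addr.get("uuid")]


def create_path(pci_address, mdev_type, sysfs_root=Path("/sys")):
    """Build the sysfs 'create' path for one PCI device and mdev type."""
    return (sysfs_root / "bus/pci/devices" / pci_address
            / "mdev_supported_types" / mdev_type / "create")


def recreate_mdev(mdev_uuid, pci_address, mdev_type, sysfs_root=Path("/sys")):
    """Equivalent of: echo '<uuid>' > .../mdev_supported_types/<type>/create"""
    create_path(pci_address, mdev_type, sysfs_root).write_text(mdev_uuid)
```

The `sysfs_root` parameter exists only so the functions can be exercised against a scratch directory; on a real host it would stay at `/sys` and the script would need to run as root after the virtual functions have been enabled.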
Hello Mickael,

There was a mailing list thread about mdevs and recreating them a while back [1]. We are not yet on Dalmatian for Nova, so we have not tested that feature, but we are using a script similar to the one linked in that thread.

/Tobias

[1] https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/message/A5G2IGCWGEM6BEQU52UMA45WISTBU2AA/

On Fri, Nov 08, 2024 at 11:18:50AM UTC, melanie witt wrote:
> However in 2024.2 (Dalmatian) we added support for persistent mdevs (note
> this requires libvirt >= 7.3.0):
>
> https://docs.openstack.org/releasenotes/nova/2024.2.html
Hello!

Thank you all for your responses. It helped me to know that this is a known issue and to determine what to do next.

Yeah, vGPU is fun. After completely figuring out vGPU management, I will look into live migration.

Have a great day,
Mickael
participants (4)
- Karl Kloppenborg
- melanie witt
- Mickael Razzouk
- Tobias Urdin