[nova] vGPU not attached to VM using Cyborg with OpenStack Caracal
Hello,

I'm facing an issue with vGPU assignment through Cyborg in OpenStack Caracal. Although Cyborg reports that the ARQ is bound successfully, the vGPU device is not attached to the VM in Nova.

- Environment:
  OS: Ubuntu 22.04.3 LTS
  Kernel: 5.15.0-78-generic
  Nova API: 29.2.0
  Cyborg API: 14.1.0.dev1
  GPU: NVIDIA A100 PCIe 40GB

- Device Profile:
  openstack accelerator device profile list
  +--------------------------------------+------+-----------------------------------------------------------------------------+
  | uuid                                 | name | groups                                                                      |
  +--------------------------------------+------+-----------------------------------------------------------------------------+
  | 8425925f-7a3c-4336-8a1c-270f481a28ae | vgpu | [{'resources:VGPU': '1', 'trait:CUSTOM_NVIDIA_20F1_A100_1_5C': 'required'}] |
  +--------------------------------------+------+-----------------------------------------------------------------------------+

Flavor:
  openstack flavor show vgpu --column properties
  +------------+-----------------------------+
  | Field      | Value                       |
  +------------+-----------------------------+
  | properties | accel:device_profile='vgpu' |
  +------------+-----------------------------+

Result After VM Creation:
  openstack accelerator arq list
  +--------------------------------------+------+-----------------------------------------------------------------------------+
  | uuid                                 | name | groups                                                                      |
  +--------------------------------------+------+-----------------------------------------------------------------------------+
  | 8425925f-7a3c-4336-8a1c-270f481a28ae | vgpu | [{'resources:VGPU': '1', 'trait:CUSTOM_NVIDIA_20F1_A100_1_5C': 'required'}] |
  +--------------------------------------+------+-----------------------------------------------------------------------------+

ARQ shows Bound with an attach_handle_type of MDEV, but the GPU device is not visible inside the VM.

  mdevctl list                 # Shows UUID and mdev type
  virsh dumpxml instance-xxxx  # No mdev device listed

Logs (Summary)

Cyborg logs indicate ARQ binding was successful and attach handle allocated (MDEV).

Nova-compute logs show:
  Ignoring accelerator requests for instance ... Supported Attach handle types: {'PCI'}. But got these unsupported types: {'MDEV'}.

Question. Is there something I’m missing in the integration between Cyborg and Nova for MDEV attach_handle_type? It seems like Nova is not yet supporting MDEV, even though Cyborg bound the ARQ correctly.

Any suggestions or known workarounds would be appreciated.

Thank you!
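The symptom described in the question (an mdev exists on the host but never appears in the domain XML) can be checked with a small helper. This is only a sketch: the function name `count_mdev_hostdevs` is made up for illustration, and it assumes libvirt's usual single-quoted attribute output, where a mediated device shows up as a `<hostdev ... type='mdev' ...>` element.

```shell
# Count mediated-device <hostdev> elements in a libvirt domain XML read
# from stdin. Prints the number of matching lines (0 if there are none).
# The function name is hypothetical, not part of any OpenStack tool.
count_mdev_hostdevs() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # helper usable under "set -e" while still printing the count.
    grep -c "<hostdev[^>]*type='mdev'" || true
}

# Usage on the compute node (instance name is the placeholder from the post):
#   virsh dumpxml instance-xxxx | count_mdev_hostdevs
```

A result of 0 together with a non-empty `mdevctl list` would match the situation in the post: the mdev was created on the host, but Nova did not add it to the guest definition.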
So I could be wrong, but I looked into this a while ago and, off the top of my head, I believe that while the work to support mdev-based GPUs in Cyborg was merged, the Nova half was never done. So, if I'm not mistaken, the only way to do vGPU with OpenStack today is via Nova.

https://github.com/openstack/nova/blob/725a307693806e6e32834198e23be75f771be...

On 04/04/2025 09:53, ts.jung@okestro.com wrote:
> Hello,
>
> I'm facing an issue with vGPU assignment through Cyborg in OpenStack Caracal. Although Cyborg reports that the ARQ is bound successfully, the vGPU device is not attached to the VM in Nova.
>
> - Environment:
>   OS: Ubuntu 22.04.3 LTS
>   Kernel: 5.15.0-78-generic
>   Nova API: 29.2.0
>   Cyborg API: 14.1.0.dev1
>   GPU: NVIDIA A100 PCIe 40GB
>
> - Device Profile:
>   openstack accelerator device profile list
>   +--------------------------------------+------+-----------------------------------------------------------------------------+
>   | uuid                                 | name | groups                                                                      |
>   +--------------------------------------+------+-----------------------------------------------------------------------------+
>   | 8425925f-7a3c-4336-8a1c-270f481a28ae | vgpu | [{'resources:VGPU': '1', 'trait:CUSTOM_NVIDIA_20F1_A100_1_5C': 'required'}] |
>   +--------------------------------------+------+-----------------------------------------------------------------------------+
>
> Flavor:
>   openstack flavor show vgpu --column properties
>   +------------+-----------------------------+
>   | Field      | Value                       |
>   +------------+-----------------------------+
>   | properties | accel:device_profile='vgpu' |
>   +------------+-----------------------------+
>
> Result After VM Creation:
>   openstack accelerator arq list
>   +--------------------------------------+------+-----------------------------------------------------------------------------+
>   | uuid                                 | name | groups                                                                      |
>   +--------------------------------------+------+-----------------------------------------------------------------------------+
>   | 8425925f-7a3c-4336-8a1c-270f481a28ae | vgpu | [{'resources:VGPU': '1', 'trait:CUSTOM_NVIDIA_20F1_A100_1_5C': 'required'}] |
>   +--------------------------------------+------+-----------------------------------------------------------------------------+
>
> ARQ shows Bound with an attach_handle_type of MDEV, but the GPU device is not visible inside the VM.
>
>   mdevctl list                 # Shows UUID and mdev type
>   virsh dumpxml instance-xxxx  # No mdev device listed
>
> Logs (Summary)
>
> Cyborg logs indicate ARQ binding was successful and attach handle allocated (MDEV).
>
> Nova-compute logs show:
>   Ignoring accelerator requests for instance ... Supported Attach handle types: {'PCI'}. But got these unsupported types: {'MDEV'}.
>
> Question. Is there something I’m missing in the integration between Cyborg and Nova for MDEV attach_handle_type? It seems like Nova is not yet supporting MDEV, even though Cyborg bound the ARQ correctly.
>
> Any suggestions or known workarounds would be appreciated.
>
> Thank you!
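
The reply's suggestion of doing vGPU through Nova alone, without a Cyborg device profile, might look roughly like the sketch below. Nothing here comes from the thread itself: the `<mdev-type>` value is a placeholder that has to be read from the host (for example from `mdevctl types`), the function name `set_vgpu_flavor_property` is hypothetical, and the `[devices]/enabled_mdev_types` option and `resources:VGPU=1` flavor property are assumed from Nova's virtual GPU documentation rather than confirmed for this deployment.

```shell
# Assumed nova.conf fragment on the compute node (not from the thread).
# <mdev-type> is a placeholder: use the type name that `mdevctl types`
# reports for the desired A100 profile.
#
#   [devices]
#   enabled_mdev_types = <mdev-type>
#
# The nova-compute service has to be restarted after changing nova.conf.

# Switch a flavor from the Cyborg device profile to a plain Nova VGPU
# request. The function name and its argument are hypothetical.
set_vgpu_flavor_property() {
    flavor="$1"
    # Remove the Cyborg device profile so Nova no longer creates an ARQ.
    openstack flavor unset --property accel:device_profile "$flavor"
    # Ask placement for one VGPU resource, which Nova attaches itself.
    openstack flavor set --property resources:VGPU=1 "$flavor"
}

# Usage (flavor name taken from the post):
#   set_vgpu_flavor_property vgpu
```

With this approach Nova's libvirt driver handles the mdev directly, so the PCI-only attach handle check quoted from the nova-compute log would not come into play.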
participants (2)
- Sean Mooney
- ts.jung@okestro.com