G'day OpenStackers.

I've got a fun one for the nova gurus among us. There may be documentation that answers this specific problem, but I have yet to come across it.

Recently we've been playing around with GPU PCI passthrough, using a couple of SXM4 NVIDIA A100s. We don't intend to slice these up with MIG as they're small cards, so we have just been trying to get passthrough working. Following the documentation [0], we've got it working in Ubuntu VMs after some troubleshooting, which I will share below in case it helps anyone else: we ran into problems with the default QEMU q35 hole size, and worked around them by modifying the grub config. However, we have not yet been able to make it work with Windows.

For reference, this is on a 2024.1 (Caracal) deployment of OpenStack.

For the Ubuntu 24 VM, the description of the problem was:

# Problem - on Ubuntu

## Replication Steps:

1. Create a new instance from an Ubuntu 24 image with properties "hw_machine_type=q35" and "hw_firmware_type=uefi", using a flavor with a100-40g attached.
2. Log into the instance and install nvidia-driver-580.

## Bug conditions:

- lspci shows that the GPU is present
- the drivers can be installed
- the nvidia-smi command fails with an error message saying it is unable to connect to the drivers
- dmesg | grep pci shows error messages for the PCI device:

    [ 1.180865] pci 0000:05:00.0: BAR 0 [mem 0xff000000-0xffffffff]: can't claim; no compatible bridge window
    [ 1.182544] pci 0000:05:00.0: BAR 1 [mem 0xfffffff000000000-0xffffffffffffffff 64bit pref]: can't claim; no compatible bridge window
    [ 1.184541] pci 0000:05:00.0: BAR 3 [mem 0xfffffffffe000000-0xffffffffffffffff 64bit pref]: can't claim; no compatible bridge window
    [ 1.416571] pci 0000:05:00.0: BAR 1 [mem size 0x1000000000 64bit pref]: can't assign; no space
    [ 1.418209] pci 0000:05:00.0: BAR 1 [mem 0xfffffff000000000-0xffffffffffffffff 64bit pref]: failed to assign
    [ 1.420057] pci 0000:05:00.0: BAR 3 [mem size 0x02000000 64bit pref]: can't assign; no space
    [ 1.423133] pci 0000:05:00.0: BAR 3 [mem 0xfffffffffe000000-0xffffffffffffffff 64bit pref]: failed to assign
    [ 1.424987] pci 0000:05:00.0: BAR 0 [mem 0x80000000-0x80ffffff]: assigned
    [ 1.545824] pci_bus 0000:00: Some PCI device resources are unassigned, try booting with pci=realloc

- lspci -v shows memory BAR0 as disabled, and does not show BAR1 or BAR3 at all:

    root@test-gpu-2:~# lspci -v -s 05:00.0
    05:00.0 3D controller: NVIDIA Corporation GA100 [A100 SXM4 40GB] (rev a1)
            Subsystem: NVIDIA Corporation GA100 [A100 SXM4 40GB]
            Physical Slot: 0-4
            Flags: fast devsel, IRQ 255
            Memory at 80000000 (32-bit, non-prefetchable) [disabled] [size=16M]

## Solution:

Append the following kernel arguments to the grub configuration:

    GRUB_CMDLINE_LINUX="pci=realloc pci=nocrs"

Then save, run update-grub, and reboot. The nvidia-smi command now works, all 3 memory BARs show up, and there are no more error messages.

This means we can have GPU-capable Linux images by baking this modification into the image or, if appropriate, applying it through cloud-init.
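For the cloud-init route mentioned above, here is a minimal, untested sketch of what the user-data could look like, rendered from Python so it can be templated. The drop-in file name `99-gpu-passthrough.cfg` and the function name `render_user_data` are assumptions made for illustration; the kernel arguments are the ones from the workaround. It relies on Ubuntu's update-grub sourcing `/etc/default/grub.d/*.cfg`, and on the cloud-init `write_files`, `runcmd` and `power_state` modules.

```python
import textwrap

# Kernel arguments taken from the workaround described above.
KERNEL_ARGS = "pci=realloc pci=nocrs"

# Hypothetical drop-in path (assumption): Ubuntu's update-grub sources
# /etc/default/grub.d/*.cfg after /etc/default/grub.
DROPIN_PATH = "/etc/default/grub.d/99-gpu-passthrough.cfg"


def render_user_data(kernel_args: str = KERNEL_ARGS, path: str = DROPIN_PATH) -> str:
    """Return cloud-config text that appends kernel_args, regenerates the
    grub configuration, and reboots once so the arguments take effect."""
    return textwrap.dedent(f"""\
        #cloud-config
        write_files:
          - path: {path}
            permissions: "0644"
            content: |
              GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX {kernel_args}"
        runcmd:
          - [update-grub]
        power_state:
          mode: reboot
          message: Rebooting to apply PCI kernel arguments
    """)


if __name__ == "__main__":
    print(render_user_data())
```

The printed text could be saved to a file and passed at boot with `openstack server create --user-data <file> ...`. Note that this sketch reboots the instance once on first boot, which may or may not be acceptable compared with baking the change into the image.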
# Problem - but on Windows

However, on Windows (tested on Server 2022 and Server 2025), we run into the same problem: Windows seemingly cannot allocate enough memory to properly load the device. It appears in Device Manager and appears to be recognised, but stalls indefinitely during driver installation. Microsoft's available logging isn't as nice or as usable as Ubuntu's, but the vibe of the logs is the same: Windows throws the error "This device cannot find enough free resources (Code 12)".

## Possible Solutions

We found that we could pass qemu arguments to the VM that would resolve the issue, but this is not viable in OpenStack as we can't pass qemu arguments from the flavor or image metadata. This solution gets thrown around on PVE forums and in lxc/libvirt-only areas, as it's not really OpenStack/nova friendly. The arguments (where VMID is the ID of the instance) to make it work were:

    qm set {VMID} -args '-global q35-pcihost.pci-hole64-size=256G'

(run on the nova_libvirt container that controls the instance)

As a note, I could not find any documentation (e.g. on qemu.org's documentation portal) that talks about q35-pcihost features, or specifically about how pci-hole64-size interacts with the system.

As you can imagine, setting arguments directly in the libvirt container isn't viable, as modifying libvirt outside of nova goes against the whole cloud-native idea of Nova. It's just not meant to work that way.

Alternatively, we've been able to get it working on an i440fx machine type with BIOS, as opposed to q35 with UEFI, but that doesn't really feel like the right way to approach this problem.

Any ideas on a good, cloud-native way to either modify our nova.conf, create a flavor, or update an image or its metadata so that a modern q35/UEFI machine running Windows can load the device properly?

Thanks in advance.
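As a rough sanity check on the 256G figure, the sizes of the BARs that failed to assign can be read straight out of the dmesg output in the Ubuntu section. The sketch below is only back-of-envelope arithmetic under stated assumptions: it considers this one GPU, assumes each BAR is naturally aligned to its own size, and ignores other devices and bridge-window alignment. It does not model how QEMU or the guest firmware actually lay out the 64-bit window, and the helper names are made up for illustration.

```python
import re

# The two "can't assign" lines copied from the dmesg output in the Ubuntu section.
DMESG = """\
pci 0000:05:00.0: BAR 1 [mem size 0x1000000000 64bit pref]: can't assign; no space
pci 0000:05:00.0: BAR 3 [mem size 0x02000000 64bit pref]: can't assign; no space
"""

BAR_RE = re.compile(r"BAR (\d+) \[mem size (0x[0-9a-fA-F]+) 64bit pref\]: can't assign")

MIB = 1 << 20
GIB = 1 << 30


def unassigned_64bit_bars(dmesg: str) -> dict[int, int]:
    """Map BAR index to size in bytes for 64-bit BARs the kernel could not place."""
    return {int(m.group(1)): int(m.group(2), 16) for m in BAR_RE.finditer(dmesg)}


def fits_in_hole(bars: dict[int, int], hole_bytes: int) -> bool:
    """Pack BARs largest first, each aligned to its own size (simplifying
    assumption), and report whether they fit in a window of hole_bytes."""
    offset = 0
    for size in sorted(bars.values(), reverse=True):
        offset = (offset + size - 1) // size * size  # align up to the BAR size
        offset += size
    return offset <= hole_bytes


if __name__ == "__main__":
    bars = unassigned_64bit_bars(DMESG)
    for index, size in sorted(bars.items()):
        print(f"BAR {index}: {size / GIB:.3f} GiB")
    print("fits in 256 GiB window:", fits_in_hole(bars, 256 * GIB))
```

BAR 1 alone is 0x1000000000 bytes, which is 64 GiB, and BAR 3 adds another 32 MiB, so under these assumptions any 64-bit window of 64 GiB or less cannot hold the card, while 256G leaves plenty of headroom.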
[0] https://docs.openstack.org/nova/latest/admin/pci-passthrough.html

Kind Regards,
Joel McLean
Cyber Security and Product Development Manager
https://www.micron21.com/