Thanks,
yes, this is my actual config
Compute nova.conf
[pci]
alias = { "vendor_id":"10de", "product_id":"1ff2",
"device_type":"type-PCI", "name":"nvidia-t400" }
device_spec = { "vendor_id": "10de", "product_id": "1ff2" }
Controller nova.conf
[pci]
alias: { "vendor_id":"10de", "product_id":"1ff2",
"device_type":"type-PCI", "name":"nvidia-t400" }
[filter_scheduler]
available_filters = nova.scheduler.filters.all_filters
enabled_filters = PciPassthroughFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter
I also tried using type-PF instead of type-PCI, with the same result.
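Since the question in this thread is whether the alias on the compute node matches the one on the controller, here is a minimal sketch for comparing the two. It assumes both nova.conf files have been copied into strings; `parse_pci_aliases` and `alias_mismatches` are hypothetical helper names, not part of Nova. The small parser is my own approximation of how the file is read (it accepts both `=` and `:` as separators and values that wrap across lines), not Nova's actual loader.

```python
import json
import re

# Matches "key = value" and "key: value" lines.
KEY_VALUE = re.compile(r"^(\w+)\s*[=:]\s*(.*)$")


def parse_pci_aliases(conf_text):
    """Return {alias_name: alias_dict} for alias entries in the [pci] section."""
    aliases = {}
    in_pci = False
    buffer = ""  # collects a JSON value that wraps across several lines
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            in_pci = line == "[pci]"
            buffer = ""
            continue
        if not in_pci:
            continue
        if buffer:
            buffer += " " + line  # continuation of a wrapped value
        else:
            match = KEY_VALUE.match(line)
            if not match or match.group(1) != "alias":
                continue
            buffer = match.group(2)
        try:
            alias = json.loads(buffer)
        except json.JSONDecodeError:
            continue  # value is not complete yet, keep reading
        aliases[alias["name"]] = alias
        buffer = ""
    return aliases


def alias_mismatches(compute_text, controller_text):
    """List alias names that are missing on one side or defined differently."""
    compute = parse_pci_aliases(compute_text)
    controller = parse_pci_aliases(controller_text)
    problems = []
    for name in sorted(set(compute) | set(controller)):
        if name not in compute:
            problems.append(f"{name}: missing on compute")
        elif name not in controller:
            problems.append(f"{name}: missing on controller")
        elif compute[name] != controller[name]:
            problems.append(f"{name}: definitions differ")
    return problems


# The two [pci] sections as posted in this message.
COMPUTE = """[pci]
alias = { "vendor_id":"10de", "product_id":"1ff2",
"device_type":"type-PCI", "name":"nvidia-t400" }
device_spec = { "vendor_id": "10de", "product_id": "1ff2" }
"""
CONTROLLER = """[pci]
alias: { "vendor_id":"10de", "product_id":"1ff2",
"device_type":"type-PCI", "name":"nvidia-t400" }
[filter_scheduler]
available_filters = nova.scheduler.filters.all_filters
"""

print(alias_mismatches(COMPUTE, CONTROLLER))
```

For the configuration posted here the list comes back empty, so under these assumptions the two alias definitions agree.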
-- Francesco Di Nucci System Administrator Compute & Networking Service, INFN Naples Email: francesco.dinucci@na.infn.it
Hi Francesco
Do you have the alias for nvidia-t400 defined in the compute node's nova.conf file?
Regards
On Fri, Mar 28, 2025 at 11:15 AM Francesco Di Nucci <francesco.dinucci@na.infn.it> wrote:
Thank you,
I tried, but the result is still the same: it does not find the flavor, even though it appears in the CLI...
$ openstack flavor show gpu_flavor
+----------------------------+---------------------------------------+
| Field                      | Value                                 |
+----------------------------+---------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                 |
| OS-FLV-EXT-DATA:ephemeral  | 0                                     |
| access_project_ids         | None                                  |
| description                | None                                  |
| disk                       | 40                                    |
| id                         | b729f843-0a66-4e74-9a74-1dc7b1fb246f  |
| name                       | gpu_flavor                            |
| os-flavor-access:is_public | True                                  |
| properties                 | pci_passthrough:alias='nvidia-t400:1' |
| ram                        | 8192                                  |
| rxtx_factor                | 1.0                                   |
| swap                       | 0                                     |
| vcpus                      | 4                                     |
+----------------------------+---------------------------------------+
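As a quick sanity check on the `properties` row, here is a minimal sketch that splits the `pci_passthrough:alias` value into alias names and counts and compares the names with the aliases configured on the controller. It assumes the value is a comma-separated list of `name:count` pairs; `parse_alias_requests` is a hypothetical helper and `configured` is filled in by hand from the [pci] alias posted in this thread.

```python
def parse_alias_requests(spec):
    """Split a value such as 'a1:2,a2:1' into {alias_name: count}."""
    requests = {}
    for item in spec.split(","):
        name, _, count = item.strip().partition(":")
        requests[name.strip()] = int(count)
    return requests


# Value taken from the flavor's properties row, without the outer quotes.
requested = parse_alias_requests("nvidia-t400:1")

# Alias names defined under [pci] on the controller (entered by hand here).
configured = {"nvidia-t400"}

# Any name listed here is requested by the flavor but not defined as an alias.
unknown = sorted(set(requested) - configured)
print(requested, unknown)
```

With the values from this thread, `unknown` is empty, so the flavor requests an alias name that is defined in the posted config.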
-- Francesco Di Nucci System Administrator Compute & Networking Service, INFN Naples Email: francesco.dinucci@na.infn.it
On 28/03/25 10:18, Rambo Rambo wrote:
Hi Francesco
I have added GPU (PCIpassthrough) successfully following: https://www.roksblog.de/enabling-gpu-passthrough-in-openstack-kolla-ansible-for-high-performance-workloads/
compute nova.conf:
[pci]
device_spec = { "vendor_id": "10de", "product_id": "25b6" }
passthrough_whitelist = { "vendor_id": "10de", "product_id": "25b6" }
alias = { "vendor_id": "10de", "product_id": "25b6", "device_type": "type-PCI", "name": "nvidia-a2" }
control nova.conf:
alias = { "name": "nvidia-a2", "product_id": "25b6", "vendor_id": "10de", "device_type": "type-PF" }
Maybe you can review your config.
Regards
On Fri, Mar 28, 2025 at 8:59 AM Francesco Di Nucci <francesco.dinucci@na.infn.it> wrote:
Hi all,
I am trying to set up PCI passthrough with Nova, following guides such as this and this:
- installed a GPU on a compute node and configured the kernel etc.; it is now using the vfio-pci driver
- set [pci]/device_spec = { "vendor_id": "10de", "product_id": "1ff2" } on compute node (formerly [pci]/passthrough_whitelist) and restarted openstack-nova-*
- On the controller node set [pci]/alias: { "vendor_id":"10de", "product_id":"1ff2", "device_type":"type-PCI", "name":"nvidia-t400" }, [filter_scheduler]/enabled_filters = PciPassthroughFilter, [filter_scheduler]/available_filters = nova.scheduler.filters.all_filters and restarted openstack-nova-*
- Created a flavor with openstack flavor create --vcpus 4 --ram 8192 --disk 40 --property "pci_passthrough:alias"="nvidia-t400:1" gpu_flavor, it is shown with openstack flavor list
- Created an image with --property img_hide_hypervisor_id=true
- Tried to create an instance with openstack server create --flavor gpu_flavor --image Almalinux_GPU --key-name "My Key" --network my_network test-gpu (also with --availability-zone nova:my-gpu-node.example.com)
- VM creation fails with these errors on the controller node:
nova-api.log
HTTP exception thrown: Flavor gpu_flavor could not be found.
nova-scheduler.log
Filter PciPassthroughFilter returned 0 hosts
Filtering removed all hosts for the request with instance ID '131d5c1c-1927-42c0-bb48-618c05d31c2a'. Filter results: ['PciPassthroughFilter: (start: 19, end: 0)']
nova-conductor.log
Failed to schedule instances: nova.exception_Remote.NoValidHost_Remote: No valid host was found. There are not enough hosts available.
If I understand correctly, it's a cascading error originating from the flavor not being found, even though it exists... Has anyone encountered something like this, or does anyone have suggestions? Am I missing some config?
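To see where the hosts are lost, the "Filter results" line in nova-scheduler.log can be read mechanically. This is a minimal sketch, assuming the log line has the same shape as the one quoted above; `first_emptying_filter` is a hypothetical helper name.

```python
import re

# Matches entries such as "PciPassthroughFilter: (start: 19, end: 0)".
FILTER_RESULT = re.compile(r"(\w+): \(start: (\d+), end: (\d+)\)")


def first_emptying_filter(log_line):
    """Return (filter_name, hosts_before) for the first filter that left 0 hosts."""
    for name, start, end in FILTER_RESULT.findall(log_line):
        if int(end) == 0:
            return name, int(start)
    return None  # no filter in this line removed every host


line = (
    "Filtering removed all hosts for the request with instance ID "
    "'131d5c1c-1927-42c0-bb48-618c05d31c2a'. Filter results: "
    "['PciPassthroughFilter: (start: 19, end: 0)']"
)
print(first_emptying_filter(line))  # ('PciPassthroughFilter', 19)
```

For the line above, this reports that 19 hosts reached PciPassthroughFilter and none passed it.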
Thanks in advance
-- Francesco Di Nucci System Administrator Compute & Networking Service, INFN Naples Email: francesco.dinucci@na.infn.it