PCI passthrough - flavor not found
Hi all,

I am trying to set up PCI passthrough with Nova, following guides such as this <https://superuser.openinfra.org/articles/a-comprehensive-guide-to-configuring-gpu-passthrough-in-openstack-for-high-performance-computing/> and this <https://medium.com/@thomasal14/gpu-passthrough-in-openstack-da2a98a16f7b>:

* Installed a GPU on a compute node, configured the kernel etc., and it is now using the vfio-pci driver
* Set [pci]/device_spec = { "vendor_id": "10de", "product_id": "1ff2" } on the compute node (formerly [pci]/passthrough_whitelist) and restarted openstack-nova-*
* On the controller node, set [pci]/alias: { "vendor_id":"10de", "product_id":"1ff2", "device_type":"type-PCI", "name":"nvidia-t400" }, [filter_scheduler]/enabled_filters = PciPassthroughFilter, [filter_scheduler]/available_filters = nova.scheduler.filters.all_filters and restarted openstack-nova-*
* Created a flavor with openstack flavor create --vcpus 4 --ram 8192 --disk 40 --property "pci_passthrough:alias"="nvidia-t400:1" gpu_flavor; it is shown by openstack flavor list
* Created an image with --property img_hide_hypervisor_id=true
* Tried to create an instance with openstack server create --flavor gpu_flavor --image Almalinux_GPU --key-name "My Key" --network my_network test-gpu (also with --availability-zone nova:my-gpu-node.example.com)
* VM creation fails with these errors on the controller node:

/nova-api.log/
HTTP exception thrown: Flavor gpu_flavor could not be found.

/nova-scheduler.log/
Filter PciPassthroughFilter returned 0 hosts
Filtering removed all hosts for the request with instance ID '131d5c1c-1927-42c0-bb48-618c05d31c2a'. Filter results: ['PciPassthroughFilter: (start: 19, end: 0)']

/nova-conductor.log/
Failed to schedule instances: nova.exception_Remote.NoValidHost_Remote: No valid host was found. There are not enough hosts available.

If I understand correctly, this is a cascading error originating from the flavor not being found, even though it exists...

Has anyone encountered something like this, or does anyone have suggestions? Am I missing some config?

Thanks in advance

-- 
Francesco Di Nucci
System Administrator
Compute & Networking Service, INFN Naples
Email: francesco.dinucci@na.infn.it
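For readers retracing the steps above: the flavor property "pci_passthrough:alias" carries a value of the form name:count, and the name is meant to match the "name" field of a [pci]/alias entry. The following is only a minimal Python sketch of that cross-check, assuming the alias value is valid JSON and that the flavor requests a single alias. The function names are illustrative and are not part of Nova.

```python
import json


def parse_alias_request(prop_value):
    """Split a flavor property value such as 'nvidia-t400:1' into (name, count)."""
    name, sep, count = prop_value.rpartition(":")
    if not sep or not name or not count.isdigit():
        raise ValueError(f"expected 'name:count', got {prop_value!r}")
    return name, int(count)


def alias_names(alias_values):
    """Collect the 'name' field from each [pci]/alias JSON string."""
    return {json.loads(value)["name"] for value in alias_values}


def flavor_matches_alias(prop_value, alias_values):
    """True if the alias name requested by the flavor is among the defined aliases."""
    name, _count = parse_alias_request(prop_value)
    return name in alias_names(alias_values)


# Alias value copied from the message above.
controller_alias = (
    '{ "vendor_id":"10de", "product_id":"1ff2", '
    '"device_type":"type-PCI", "name":"nvidia-t400" }'
)
print(flavor_matches_alias("nvidia-t400:1", [controller_alias]))
```

With the values from this message the names do match, so a naming mismatch between flavor and alias would not explain the error on its own.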
Hi Francesco,

We had something similar with our A100 GPUs: we had to set "device_type":"type-PF" instead of "type-PCI", and then it worked. Maybe give that a go. I've seen it mentioned on the list before.

Regards
Mike
Hi Francesco

I have added a GPU (PCI passthrough) successfully following: https://www.roksblog.de/enabling-gpu-passthrough-in-openstack-kolla-ansible-for-high-performance-workloads/

compute nova.conf:
[pci]
device_spec = { "vendor_id": "10de", "product_id": "25b6" }
passthrough_whitelist = { "vendor_id": "10de", "product_id": "25b6" }
alias = { "vendor_id": "10de", "product_id": "25b6", "device_type": "type-PCI", "name": "nvidia-a2" }

control nova.conf:
alias = { "name": "nvidia-a2", "product_id": "25b6", "vendor_id": "10de", "device_type": "type-PF" }

Maybe you can review the config.

Regards
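When reviewing a config like the one above, it can help to compare the alias defined on the compute node with the one on the controller field by field. This is a rough Python sketch under the assumption that both values are valid JSON objects. The helper name and the list of fields checked are illustrative choices, and whether a given difference matters depends on the deployment.

```python
import json

# Fields compared by this sketch; chosen because they appear in the thread.
ALIAS_FIELDS = ("name", "vendor_id", "product_id", "device_type")


def alias_differences(compute_value, controller_value):
    """Return {field: (compute, controller)} for every field that differs."""
    compute = json.loads(compute_value)
    controller = json.loads(controller_value)
    return {
        field: (compute.get(field), controller.get(field))
        for field in ALIAS_FIELDS
        if compute.get(field) != controller.get(field)
    }


# Alias values copied from the message above.
compute_alias = (
    '{ "vendor_id": "10de", "product_id": "25b6", '
    '"device_type": "type-PCI", "name": "nvidia-a2" }'
)
control_alias = (
    '{ "name": "nvidia-a2", "product_id": "25b6", '
    '"vendor_id": "10de", "device_type": "type-PF" }'
)
print(alias_differences(compute_alias, control_alias))
```

For the two lines quoted in this message, the only field reported is device_type (type-PCI on the compute node, type-PF on the controller).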
Thank you,

I tried, but the result is still the same: it does not find the flavor, even though it appears on the CLI...

$ openstack flavor show gpu_flavor
+----------------------------+---------------------------------------+
| Field                      | Value                                 |
+----------------------------+---------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                 |
| OS-FLV-EXT-DATA:ephemeral  | 0                                     |
| access_project_ids         | None                                  |
| description                | None                                  |
| disk                       | 40                                    |
| id                         | b729f843-0a66-4e74-9a74-1dc7b1fb246f  |
| name                       | gpu_flavor                            |
| os-flavor-access:is_public | True                                  |
| properties                 | pci_passthrough:alias='nvidia-t400:1' |
| ram                        | 8192                                  |
| rxtx_factor                | 1.0                                   |
| swap                       | 0                                     |
| vcpus                      | 4                                     |
+----------------------------+---------------------------------------+

-- 
Francesco Di Nucci
Yes, the flavor is public, so the project/tenant shouldn't matter:

$ openstack flavor list
+--------------------------------------+-----------------------+-------+------+-----------+-------+-----------+
| ID                                   | Name                  |   RAM | Disk | Ephemeral | VCPUs | Is Public |
+--------------------------------------+-----------------------+-------+------+-----------+-------+-----------+
(...)
| b729f843-0a66-4e74-9a74-1dc7b1fb246f | gpu_flavor            |  8192 |   40 |         0 |     4 | True      |
(...)
+--------------------------------------+-----------------------+-------+------+-----------+-------+-----------+

Thanks

-- 
Francesco Di Nucci

On 28/03/25 13:47, Oliver Weinmann wrote:
Hi,
Are you trying to deploy the instance in a different project (tenant)?
Can you please check if your flavor is publicly available?
Cheers, Oliver
Sent from my iPhone
Hi Francesco

Do you have an alias defined for nvidia-t400 in the compute node's nova.conf file?

Regards
Thanks,

yes, this is my actual config

Compute nova.conf
[pci]
alias = { "vendor_id":"10de", "product_id":"1ff2", "device_type":"type-PCI", "name":"nvidia-t400" }
device_spec = { "vendor_id": "10de", "product_id": "1ff2" }

Controller nova.conf
[pci]
alias: { "vendor_id":"10de", "product_id":"1ff2", "device_type":"type-PCI", "name":"nvidia-t400" }
[filter_scheduler]
available_filters = nova.scheduler.filters.all_filters
enabled_filters = PciPassthroughFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter

I also tried using type-PF instead of type-PCI, with the same result

-- 
Francesco Di Nucci

On 28/03/25 15:04, Rambo Rambo wrote:
Hi Francesco
Do you have alias defined for nvidia-t400 in the compute node nova.conf file?
Regards
On 31/03/2025 08:39, Francesco Di Nucci wrote:
Thanks,
yes, this is my actual config
Compute nova.conf
[pci]
alias = { "vendor_id":"10de", "product_id":"1ff2", "device_type":"type-PCI", "name":"nvidia-t400" }
device_spec = { "vendor_id": "10de", "product_id": "1ff2" }

Controller nova.conf
[pci]
alias: { "vendor_id":"10de", "product_id":"1ff2", "device_type":"type-PCI", "name":"nvidia-t400" }

If this is your actual config, the ":" instead of "=" in your controller config is the problem.
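A slip like the one pointed out here is easy to miss by eye. Below is a small Python sketch that lists option lines whose first delimiter is ":" rather than "=". It is only a text-level check written for this thread and does not reproduce how Nova itself parses nova.conf. The function name and the regular expression are illustrative assumptions.

```python
import re

# Matches "key<delimiter>" at the start of a line, where the delimiter is ":" or "=".
_ASSIGNMENT = re.compile(r"^\s*([A-Za-z0-9_.-]+)\s*([:=])")


def colon_assignments(conf_text):
    """Return (line_number, key) for option lines that use ':' as the delimiter."""
    hits = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", ";", "[")):
            continue  # skip blank lines, comments and section headers
        match = _ASSIGNMENT.match(line)
        if match and match.group(2) == ":":
            hits.append((number, match.group(1)))
    return hits


# Controller config as quoted in the thread.
controller_conf = """\
[pci]
alias: { "vendor_id":"10de", "product_id":"1ff2", "device_type":"type-PCI", "name":"nvidia-t400" }
[filter_scheduler]
available_filters = nova.scheduler.filters.all_filters
"""
print(colon_assignments(controller_conf))
```

Run against the controller config quoted above, this reports the alias line in the [pci] section and nothing else. The colons inside the JSON value are not flagged, because only the first delimiter on the line is inspected.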
Thank you, now it's working. It was this "small" detail, plus I learnt the hard way that I had to use dracut to regenerate the initramfs...

Thanks to all those who helped me :)

-- 
Francesco Di Nucci

On 31/03/25 12:55, Sean Mooney wrote:
If this is your actual config, the ":" instead of "=" in your controller config is the problem.
[filter_scheduler] available_filters = nova.scheduler.filters.all_filters enabled_filters = PciPassthroughFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter
I also tried using type-PF instead of type-PCI, with the same result
-- Francesco Di Nucci System Administrator Compute & Networking Service, INFN Naples
Email:francesco.dinucci@na.infn.it On 28/03/25 15:04, Rambo Rambo wrote:
Hi Francesco
Do you have alias defined for nvidia-t400 in the compute node nova.conf file?
Regards
On Fri, Mar 28, 2025 at 11:15 AM Francesco Di Nucci <francesco.dinucci@na.infn.it> wrote:
Thank you,
I tried but the result is still the same, it does not find the flavor, even though it appears on CLI...
$ openstack flavor show gpu_flavor +----------------------------+---------------------------------------+ | Field | Value | +----------------------------+---------------------------------------+ | OS-FLV-DISABLED:disabled | False | | OS-FLV-EXT-DATA:ephemeral | 0 | | access_project_ids | None | | description | None | | disk | 40 | | id | b729f843-0a66-4e74-9a74-1dc7b1fb246f | | name | gpu_flavor | | os-flavor-access:is_public | True | | properties | pci_passthrough:alias='nvidia-t400:1' | | ram | 8192 | | rxtx_factor | 1.0 | | swap | 0 | | vcpus | 4 | +----------------------------+---------------------------------------+
On 28/03/25 10:18, Rambo Rambo wrote:
Hi Francesco
I have added a GPU (PCI passthrough) successfully following: https://www.roksblog.de/enabling-gpu-passthrough-in-openstack-kolla-ansible-for-high-performance-workloads/
compute nova.conf:
[pci]
device_spec = { "vendor_id": "10de", "product_id": "25b6" }
passthrough_whitelist = { "vendor_id": "10de", "product_id": "25b6" }
alias = { "vendor_id": "10de", "product_id": "25b6", "device_type": "type-PCI", "name": "nvidia-a2" }
control nova.conf:
alias = { "name": "nvidia-a2", "product_id": "25b6", "vendor_id": "10de", "device_type": "type-PF" }
Maybe you can review your config against this.
Regards
On Fri, Mar 28, 2025 at 8:59 AM Francesco Di Nucci <francesco.dinucci@na.infn.it> wrote:
Hi all,
I am trying to set up PCI passthrough with Nova, following guides such as this <https://superuser.openinfra.org/articles/a-comprehensive-guide-to-configuring-gpu-passthrough-in-openstack-for-high-performance-computing/> and this <https://medium.com/@thomasal14/gpu-passthrough-in-openstack-da2a98a16f7b>:
* installed a GPU on a compute node, configured kernel etc and now it is using the vfio-pci driver
* set [pci]/device_spec = { "vendor_id": "10de", "product_id": "1ff2" } on compute node (formerly [pci]/passthrough_whitelist) and restarted openstack-nova-*
* On the controller node set [pci]/alias: { "vendor_id":"10de", "product_id":"1ff2", "device_type":"type-PCI", "name":"nvidia-t400" }, [filter_scheduler]/enabled_filters = PciPassthroughFilter, [filter_scheduler]/available_filters = nova.scheduler.filters.all_filters and restarted openstack-nova-*
* Created a flavor with openstack flavor create --vcpus 4 --ram 8192 --disk 40 --property "pci_passthrough:alias"="nvidia-t400:1" gpu_flavor, it is shown with openstack flavor list
* Created an image with --property img_hide_hypervisor_id=true
* Tried to create an instance with openstack server create --flavor gpu_flavor --image Almalinux_GPU --key-name "My Key" --network my_network test-gpu (also with --availability-zone nova:my-gpu-node.example.com)
* VM creation fails with these errors on the controller node
/nova-api.log/
HTTP exception thrown: Flavor gpu_flavor could not be found.
/nova-scheduler.log/
Filter PciPassthroughFilter returned 0 hosts
Filtering removed all hosts for the request with instance ID '131d5c1c-1927-42c0-bb48-618c05d31c2a'. Filter results: ['PciPassthroughFilter: (start: 19, end: 0)']
/nova-conductor.log/
Failed to schedule instances: nova.exception_Remote.NoValidHost_Remote: No valid host was found. There are not enough hosts available.
If I understand correctly, it's a cascading error stemming from the flavor not being found, even though it exists... Has anyone encountered something like this, or does anyone have suggestions? Am I missing some config?
Thanks in advance
• installed a GPU on a compute node, configured kernel etc and now it is using the vfio-pci driver
• set [pci]/device_spec = { "vendor_id": "10de", "product_id": "1ff2" } on compute node (formerly [pci]/passthrough_whitelist) and restarted openstack-nova-*
• On the controller node set [pci]/alias: { "vendor_id":"10de", "product_id":"1ff2", "device_type":"type-PCI", "name":"nvidia-t400" }, [filter_scheduler]/enabled_filters = PciPassthroughFilter, [filter_scheduler]/available_filters = nova.scheduler.filters.all_filters and restarted openstack-nova-*
Did you add the alias line to the compute node as well? The controllers need just the alias, the compute needs the alias and device_spec.
https://docs.openstack.org/nova/latest/admin/pci-passthrough.html#configure-...
--Dan
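The split Dan describes could look roughly like this, reusing the T400 vendor and product IDs from the original post. This is a sketch of two separate files shown in one block, not a verified configuration:

```ini
# --- Controller nova.conf: alias only ---
[pci]
alias = { "vendor_id":"10de", "product_id":"1ff2", "device_type":"type-PCI", "name":"nvidia-t400" }

# --- Compute nova.conf: alias plus device_spec ---
[pci]
alias = { "vendor_id":"10de", "product_id":"1ff2", "device_type":"type-PCI", "name":"nvidia-t400" }
device_spec = { "vendor_id": "10de", "product_id": "1ff2" }
```

The alias name (nvidia-t400 here) is what the flavor property pci_passthrough:alias refers to, so it has to match on both sides.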
participants (6)
- Dan Smith
- Francesco Di Nucci
- Mike Currin
- Oliver Weinmann
- Rambo Rambo
- Sean Mooney