[nova][neutron][kolla]failing to create a vm with SR-IOV (through neutron) - device not found but not listed under libvirt devices

Manuel Sopena Ballesteros manuel.sb at garvan.org.au
Tue Feb 26 06:41:36 UTC 2019


I also tried specifying the devices like this, but I am still getting the same error message:

passthrough_whitelist = [... {"devname": "ens2f0", "physical_network": "sriovtenant1"}, {"devname": "ens2f1", "physical_network": "sriovtenant1"}]

thank you

Manuel

From: Manuel Sopena Ballesteros [mailto:manuel.sb at garvan.org.au]
Sent: Tuesday, February 26, 2019 4:08 PM
To: openstack at lists.openstack.org
Cc: Adrian Chiris (adrianc at mellanox.com)
Subject: failing to create a vm with SR-IOV (through neutron) - device not found but not listed under libvirt devices

Dear Openstack community,

I am facing an issue trying to set up neutron with SR-IOV and would like to ask for some help:
server creation fails, and openstack server show reports a fault.

My environment is Openstack Rocky deployed with kolla-ansible.

I have edited the configuration files as suggested by the documentation but for some reason nova can't find the PCI device for pass-through.

This is my setup:

[root@zeus-59 ~]# lspci -nn | grep -i mell
88:00.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
88:00.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
88:00.2 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:00.3 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:00.4 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:00.5 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:00.6 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:00.7 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:01.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:01.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:01.2 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:01.3 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:01.4 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:01.5 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:01.6 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:01.7 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:02.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
88:02.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
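
In the listing above, the two physical functions report the ID pair 15b3:1015 and the virtual functions report 15b3:1016, so a vendor_id/product_id filter selects one group or the other. A minimal Python sketch for tallying devices per ID pair from lspci -nn output follows. The helper name count_ids and the three-line sample are illustrative only, and whether the product ID is related to the failure is not established here.

```python
import re
from collections import Counter

# Three lines copied from the lspci output above (two PFs, one VF).
LSPCI_SAMPLE = """\
88:00.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
88:00.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
88:00.2 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]
"""

# The [vendor:product] pair is the last bracketed field on each line.
ID_RE = re.compile(r"^(\S+)\s.*\[([0-9a-f]{4}):([0-9a-f]{4})\]\s*$")


def count_ids(lspci_text):
    """Return a Counter keyed by (vendor_id, product_id). Hypothetical helper."""
    counts = Counter()
    for line in lspci_text.splitlines():
        match = ID_RE.match(line.strip())
        if match:
            counts[(match.group(2), match.group(3))] += 1
    return counts


if __name__ == "__main__":
    for (vendor, product), n in sorted(count_ids(LSPCI_SAMPLE).items()):
        print("%s:%s -> %d device(s)" % (vendor, product, n))
```

Feeding the full lspci -nn output into count_ids gives the number of devices that each vendor_id/product_id pair in a whitelist entry would cover.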


sriov_agent.ini (compute node):

[sriov_nic]
physical_device_mappings = sriovtenant1:ens2f0,sriovtenant1:ens2f1
exclude_devices =

[securitygroup]
firewall_driver = neutron.agent.firewall.NoopFirewallDriver


nova.conf (nova-compute):

...
[pci]
passthrough_whitelist = [{ "vendor_id": "10de", "product_id": "1db1" }, { "vendor_id": "15b3", "product_id": "1015", "physical_network": "sriovtenant1" }]
alias = { "vendor_id":"10de", "product_id":"1db1", "device_type":"type-PCI", "name":"nv_v100" }


ml2_conf.ini (neutron-server):

[ml2]
type_drivers = flat,vlan,vxlan
tenant_network_types = vxlan
mechanism_drivers = openvswitch,l2population,sriovnicswitch
extension_drivers = port_security

[ml2_type_vlan]
network_vlan_ranges = physnet1, sriovtenant1

[ml2_type_flat]
flat_networks = sriovtenant1
...
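
The physical network name sriovtenant1 appears in all three files quoted above. A rough Python sketch for checking that the names agree follows; it is a simplification, not how nova or neutron read their configuration. It assumes passthrough_whitelist is a single-line JSON list as shown here, and the helper names read_ini, csv_names and physnet_report are made up for illustration.

```python
import configparser
import json

# Excerpts copied from the configuration quoted in this message.
SRIOV_AGENT_INI = """\
[sriov_nic]
physical_device_mappings = sriovtenant1:ens2f0,sriovtenant1:ens2f1
"""

ML2_CONF_INI = """\
[ml2_type_vlan]
network_vlan_ranges = physnet1, sriovtenant1

[ml2_type_flat]
flat_networks = sriovtenant1
"""

NOVA_CONF = """\
[pci]
passthrough_whitelist = [{ "vendor_id": "10de", "product_id": "1db1" }, { "vendor_id": "15b3", "product_id": "1015", "physical_network": "sriovtenant1" }]
"""


def read_ini(text):
    """Parse INI text; interpolation is off so JSON values are kept verbatim."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def csv_names(value):
    """Split a comma-separated list and keep the part before the first colon."""
    return {item.split(":")[0].strip() for item in value.split(",") if item.strip()}


def physnet_report(agent_text, ml2_text, nova_text):
    """Collect the physical network names referenced by each file."""
    agent = read_ini(agent_text)
    ml2 = read_ini(ml2_text)
    nova = read_ini(nova_text)
    whitelist = json.loads(nova["pci"]["passthrough_whitelist"])
    return {
        "whitelist": {e["physical_network"] for e in whitelist if "physical_network" in e},
        "agent": csv_names(agent["sriov_nic"]["physical_device_mappings"]),
        "flat": csv_names(ml2["ml2_type_flat"]["flat_networks"]),
        "vlan": csv_names(ml2["ml2_type_vlan"]["network_vlan_ranges"]),
    }


if __name__ == "__main__":
    report = physnet_report(SRIOV_AGENT_INI, ML2_CONF_INI, NOVA_CONF)
    # Names whitelisted in nova.conf but unknown to the agent or to ML2.
    print("missing from agent:", report["whitelist"] - report["agent"])
    print("missing from ML2:", report["whitelist"] - (report["flat"] | report["vlan"]))
```

For the excerpts above, both differences come out empty, so the names themselves look consistent.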


Create network and subnet:

openstack network create \
    --provider-physical-network sriovtenant1 \
    --provider-network-type flat \
    sriovnet1

openstack subnet create --network sriovnet1 \
  --subnet-range=10.0.0.0/16 \
  --allocation-pool start=10.0.32.10,end=10.0.32.20 \
  sriovnet1_sub1

Create port:

openstack port create --network sriovnet1 --vnic-type direct sriovnet1-port1


Create server:

openstack server create --flavor m1.large \
  --image centos7.5-image \
  --nic port-id=373fe020-7b89-40ab-a8e4-76b82cb47490 \
  --key-name mykey \
  --availability-zone nova:zeus-59.localdomain \
  vm-sriov-neutron-1


The server does not get created and fails with the following error:

| fault                               | {u'message': u'PCI device not found for request ID 419c5fa0-1b7a-4a83-b691-2fcb0fba94cc.', u'code': 500, u'details': u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1940, in _do_build_and_run_instance\n    filter_properties, request_spec)\n  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2229, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'created': u'2019-02-26T04:21:01Z'} |


Nova logs

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager Traceback (most recent call last):
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7778, in _update_available_resource_for_node
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 705, in update_available_resource
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     resources = self.driver.get_available_resource(nodename)
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6551, in get_available_resource
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     self._get_pci_passthrough_devices()
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5978, in _get_pci_passthrough_devices
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     pci_info.append(self._get_pcidev_info(name))
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5939, in _get_pcidev_info
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     device.update(_get_device_capabilities(device, address))
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5910, in _get_device_capabilities
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     pcinet_info = self._get_pcinet_info(address)
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5853, in _get_pcinet_info
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     virtdev = self._host.device_lookup_by_name(devname)
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 873, in device_lookup_by_name
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     return self.get_connection().nodeDeviceLookupByName(name)
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     result = proxy_call(self._autowrap, f, *args, **kwargs)
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     rv = execute(f, *args, **kwargs)
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     six.reraise(c, e, tb)
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     rv = meth(*args, **kwargs)
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 4305, in nodeDeviceLookupByName
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     if ret is None:raise libvirtError('virNodeDeviceLookupByName() failed', conn=self)
2019-02-26 15:36:42.595 7 ERROR nova.compute.manager libvirtError: Node device not found: no node device with matching name 'net_enp136s1f2_36_cc_24_fb_76_e3'


Devices known to libvirt:

[root@zeus-59 ~]# docker exec nova_libvirt sudo virsh nodedev-list
block_sda_SATA_SSD_7F2F0759012400117253
computer
drm_card0
net_enp136s1_66_0f_f8_42_3b_bb
net_enp136s1f1_82_e4_57_39_4c_bf
net_enp136s1f2_6e_52_31_b0_04_b7????
net_enp136s1f3_e6_be_60_01_f3_9f
net_enp136s1f4_be_d1_8e_9e_46_ef
net_enp136s1f5_4e_61_1c_40_98_dc
net_enp136s1f6_4a_75_ee_f7_c9_68
net_enp136s1f7_de_3e_da_36_48_02
net_enp136s2_3e_dc_23_b4_ca_c4
net_enp136s2f1_c6_12_aa_52_fa_34
net_enp1s0f0_0c_c4_7a_a4_82_ae
net_enp1s0f1_0c_c4_7a_a4_82_af
net_ens1f0_90_e2_ba_03_4c_c8
net_ens1f1_90_e2_ba_03_4c_c9
net_ens2f0_7c_fe_90_12_22_b4
net_ens2f1_7c_fe_90_12_22_b5
net_ens2f2_36_cc_24_fb_76_e3
net_ens2f3_82_42_06_38_9a_b7
net_ens2f4_7e_c0_bb_98_72_f4
net_ens2f5_be_9c_1c_25_ff_0d
net_ens2f6_2e_01_90_f8_44_b5
net_ens2f7_7e_6a_6d_4e_89_1b
pci_0000_00_00_0
pci_0000_00_01_0
pci_0000_00_02_0
pci_0000_00_02_1
pci_0000_00_02_2
pci_0000_00_03_0
pci_0000_00_03_1
pci_0000_00_03_2
...

Note: the documentation says to use VLAN networks, but I am using flat networks.

Why is nova looking for a device that libvirt doesn't know about?
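
To make the mismatch concrete, here is a small Python sketch that compares the name from the libvirtError with a few entries of the nodedev-list output. It assumes the names follow the pattern net_<ifname>_<six MAC octets>, which is inferred from the listing above; split_name and related are hypothetical helpers, not nova or libvirt functions.

```python
# Name nova asked libvirt for, copied from the traceback above.
MISSING = "net_enp136s1f2_36_cc_24_fb_76_e3"

# A few entries copied from the virsh nodedev-list output above.
LISTED = [
    "net_enp136s1f2_6e_52_31_b0_04_b7",
    "net_enp136s1f3_e6_be_60_01_f3_9f",
    "net_ens2f0_7c_fe_90_12_22_b4",
    "net_ens2f2_36_cc_24_fb_76_e3",
    "net_ens2f3_82_42_06_38_9a_b7",
]


def split_name(nodedev):
    """Split 'net_<ifname>_<6 MAC octets>' into (ifname, mac)."""
    parts = nodedev.split("_")
    if parts[0] != "net" or len(parts) < 8:
        raise ValueError("not a net node device name: %r" % nodedev)
    ifname = "_".join(parts[1:-6])
    mac = ":".join(parts[-6:])
    return ifname, mac


def related(missing, listed):
    """Return listed names sharing the interface name or the MAC of `missing`."""
    want_if, want_mac = split_name(missing)
    same_if = [n for n in listed if split_name(n)[0] == want_if]
    same_mac = [n for n in listed if split_name(n)[1] == want_mac]
    return same_if, same_mac


if __name__ == "__main__":
    same_if, same_mac = related(MISSING, LISTED)
    print("same interface name:", same_if)
    print("same MAC address:", same_mac)
```

With these inputs, the interface name enp136s1f2 is listed with a different MAC, and the MAC 36:cc:24:fb:76:e3 is listed under ens2f2, so the name nova requests combines parts of two different listed entries.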

Thank you very much
NOTICE
Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed.