Dear OpenStack community,

 

I am facing an issue trying to set up Neutron with SR-IOV and would like to ask for some help. The error shown further down comes from the fault field of openstack server show.

 

My environment is OpenStack Rocky, deployed with kolla-ansible.

 

I have edited the configuration files as suggested by the documentation but for some reason nova can’t find the PCI device for pass-through.

 

This is my setup:

 

[root@zeus-59 ~]# lspci -nn | grep -i mell

88:00.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]

88:00.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]

88:00.2 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:00.3 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:00.4 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:00.5 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:00.6 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:00.7 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:01.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:01.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:01.2 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:01.3 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:01.4 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:01.5 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:01.6 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:01.7 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:02.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

88:02.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016]

 

 

sriov_agent.ini in compute node content:

 

[sriov_nic]

physical_device_mappings = sriovtenant1:ens2f0,sriovtenant1:ens2f1

exclude_devices =

 

[securitygroup]

firewall_driver = neutron.agent.firewall.NoopFirewallDriver

 

 

nova.conf (nova-compute):

 

[pci]

passthrough_whitelist = [{ "vendor_id": "10de", "product_id": "1db1" }, { "vendor_id": "15b3", "product_id": "1015", "physical_network": "sriovtenant1" }]

alias = { "vendor_id":"10de", "product_id":"1db1", "device_type":"type-PCI", "name":"nv_v100" }
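
As a sanity check on the whitelist, here is a small sketch that compares the vendor and product IDs from passthrough_whitelist literally against a few entries from the lspci output above. The helper names (id_matches, whitelisted_by_id) are my own and this is not nova code; it does not reproduce nova's real matching logic, it only shows which of the listed functions carry the whitelisted IDs.

```python
import json

# Value of passthrough_whitelist, copied from the nova.conf snippet above.
WHITELIST = json.loads(
    '[{ "vendor_id": "10de", "product_id": "1db1" }, '
    '{ "vendor_id": "15b3", "product_id": "1015", '
    '"physical_network": "sriovtenant1" }]'
)

# (PCI address, vendor_id, product_id) for a few devices from the lspci output.
DEVICES = [
    ("88:00.0", "15b3", "1015"),  # ConnectX-4 Lx physical function
    ("88:00.1", "15b3", "1015"),  # ConnectX-4 Lx physical function
    ("88:00.2", "15b3", "1016"),  # ConnectX-4 Lx virtual function
    ("88:02.1", "15b3", "1016"),  # ConnectX-4 Lx virtual function
]


def id_matches(entry, vendor_id, product_id):
    """Hypothetical helper: literal vendor/product ID comparison only."""
    return (entry.get("vendor_id") == vendor_id
            and entry.get("product_id") == product_id)


def whitelisted_by_id(devices, whitelist):
    """Return the addresses whose IDs appear in at least one whitelist entry."""
    return [addr for addr, vendor, product in devices
            if any(id_matches(entry, vendor, product) for entry in whitelist)]


if __name__ == "__main__":
    print(whitelisted_by_id(DEVICES, WHITELIST))
```

On this ID-only comparison, just the two 15b3:1015 functions match; the 15b3:1016 virtual functions do not. I have not confirmed whether that is relevant to the error below.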

 

 

ml2_conf.ini (neutron-server):

 

[ml2]

type_drivers = flat,vlan,vxlan

tenant_network_types = vxlan

mechanism_drivers = openvswitch,l2population,sriovnicswitch

extension_drivers = port_security

 

[ml2_type_vlan]

network_vlan_ranges = physnet1, sriovtenant1

 

[ml2_type_flat]

flat_networks = sriovtenant1

 

 


Create network and subnet:

 

openstack network create \

    --provider-physical-network sriovtenant1 \

    --provider-network-type flat \

    sriovnet1

 

openstack subnet create --network sriovnet1 \

  --subnet-range=10.0.0.0/16 \

  --allocation-pool start=10.0.32.10,end=10.0.32.20 \

  sriovnet1_sub1

 

Create port:

 

openstack port create --network sriovnet1 --vnic-type direct sriovnet1-port1

 

 

Create server:

 

openstack server create --flavor m1.large \

  --image centos7.5-image \

  --nic port-id=373fe020-7b89-40ab-a8e4-76b82cb47490 \

  --key-name mykey \

  --availability-zone nova:zeus-59.localdomain \

  vm-sriov-neutron-1

 

 

The server is not created and fails with the following error:

 

| fault                               | {u'message': u'PCI device not found for request ID 419c5fa0-1b7a-4a83-b691-2fcb0fba94cc.', u'code': 500, u'details': u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1940, in _do_build_and_run_instance\n    filter_properties, request_spec)\n  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2229, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'created': u'2019-02-26T04:21:01Z'} |

 

 

Nova logs:

 

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager Traceback (most recent call last):

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7778, in _update_available_resource_for_node

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 705, in update_available_resource

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     resources = self.driver.get_available_resource(nodename)

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6551, in get_available_resource

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     self._get_pci_passthrough_devices()

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5978, in _get_pci_passthrough_devices

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     pci_info.append(self._get_pcidev_info(name))

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5939, in _get_pcidev_info

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     device.update(_get_device_capabilities(device, address))

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5910, in _get_device_capabilities

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     pcinet_info = self._get_pcinet_info(address)

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5853, in _get_pcinet_info

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     virtdev = self._host.device_lookup_by_name(devname)

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 873, in device_lookup_by_name

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     return self.get_connection().nodeDeviceLookupByName(name)

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     result = proxy_call(self._autowrap, f, *args, **kwargs)

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     rv = execute(f, *args, **kwargs)

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     six.reraise(c, e, tb)

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     rv = meth(*args, **kwargs)

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 4305, in nodeDeviceLookupByName

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager     if ret is None:raise libvirtError('virNodeDeviceLookupByName() failed', conn=self)

2019-02-26 15:36:42.595 7 ERROR nova.compute.manager libvirtError: Node device not found: no node device with matching name 'net_enp136s1f2_36_cc_24_fb_76_e3'

 

 

Devices:

 

[root@zeus-59 ~]# docker exec nova_libvirt sudo virsh nodedev-list

block_sda_SATA_SSD_7F2F0759012400117253

computer

drm_card0

net_enp136s1_66_0f_f8_42_3b_bb

net_enp136s1f1_82_e4_57_39_4c_bf

net_enp136s1f2_6e_52_31_b0_04_b7????

net_enp136s1f3_e6_be_60_01_f3_9f

net_enp136s1f4_be_d1_8e_9e_46_ef

net_enp136s1f5_4e_61_1c_40_98_dc

net_enp136s1f6_4a_75_ee_f7_c9_68

net_enp136s1f7_de_3e_da_36_48_02

net_enp136s2_3e_dc_23_b4_ca_c4

net_enp136s2f1_c6_12_aa_52_fa_34

net_enp1s0f0_0c_c4_7a_a4_82_ae

net_enp1s0f1_0c_c4_7a_a4_82_af

net_ens1f0_90_e2_ba_03_4c_c8

net_ens1f1_90_e2_ba_03_4c_c9

net_ens2f0_7c_fe_90_12_22_b4

net_ens2f1_7c_fe_90_12_22_b5

net_ens2f2_36_cc_24_fb_76_e3

net_ens2f3_82_42_06_38_9a_b7

net_ens2f4_7e_c0_bb_98_72_f4

net_ens2f5_be_9c_1c_25_ff_0d

net_ens2f6_2e_01_90_f8_44_b5

net_ens2f7_7e_6a_6d_4e_89_1b

pci_0000_00_00_0

pci_0000_00_01_0

pci_0000_00_02_0

pci_0000_00_02_1

pci_0000_00_02_2

pci_0000_00_03_0

pci_0000_00_03_1

pci_0000_00_03_2

...
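
To make the mismatch easier to see, here is a minimal sketch that splits the device name from the libvirtError into interface name and MAC address, then looks for entries in the nodedev-list output that share either part. It assumes the names follow the pattern net_<interface>_<six MAC octets>, which is only inferred from the listing above; split_name and related are hypothetical helpers, not part of nova or libvirt.

```python
# Device name nova asked libvirt for (from the libvirtError above).
REQUESTED = "net_enp136s1f2_36_cc_24_fb_76_e3"

# A subset of the net_* names from the virsh nodedev-list output.
KNOWN = [
    "net_enp136s1f2_6e_52_31_b0_04_b7",
    "net_ens2f0_7c_fe_90_12_22_b4",
    "net_ens2f1_7c_fe_90_12_22_b5",
    "net_ens2f2_36_cc_24_fb_76_e3",
    "net_ens2f3_82_42_06_38_9a_b7",
]


def split_name(name):
    """Split 'net_<iface>_<6 MAC octets>' into (iface, mac).

    The naming pattern is an assumption based on the listing above.
    """
    parts = name.split("_")
    if parts[0] != "net" or len(parts) < 8:
        raise ValueError("unexpected node device name: %s" % name)
    iface = "_".join(parts[1:-6])
    mac = ":".join(parts[-6:])
    return iface, mac


def related(requested, known):
    """Find known devices that share the interface name or the MAC."""
    want_iface, want_mac = split_name(requested)
    same_iface = [n for n in known if split_name(n)[0] == want_iface]
    same_mac = [n for n in known if split_name(n)[1] == want_mac]
    return {"same_iface": same_iface, "same_mac": same_mac}


if __name__ == "__main__":
    print(split_name(REQUESTED))
    print(related(REQUESTED, KNOWN))
```

With the names above, the requested interface name enp136s1f2 is present in libvirt but with MAC 6e:52:31:b0:04:b7, while the requested MAC 36:cc:24:fb:76:e3 is present under the interface name ens2f2.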

 

Note: the documentation says to use VLAN networks, but I am using flat networks.

 

Why is nova looking for a device that libvirt doesn’t know about?

 

Thank you very much
