NVMe PCI-passthrough
Stephen Finucane
sfinucan at redhat.com
Wed Apr 3 09:20:09 UTC 2019
On Wed, 2019-04-03 at 06:18 +0000, Manuel Sopena Ballesteros wrote:
> Dear OpenStack community,
>
> I am trying to attach NVMe drives to my VMs via PCI passthrough, but it is not working, so I would like to check my configuration.
>
> These are the details of my physical host:
>
> [root@zeus-59 ~]# lspci -nn -D | grep -i ssd
> 0000:03:00.0 Non-Volatile memory controller [0108]: Intel Corporation PCIe Data Center SSD [8086:0953] (rev 01)
> 0000:04:00.0 Non-Volatile memory controller [0108]: Intel Corporation PCIe Data Center SSD [8086:0953] (rev 01)
> 0000:06:00.0 Non-Volatile memory controller [0108]: Intel Corporation PCIe Data Center SSD [8086:0953] (rev 01)
> 0000:07:00.0 Non-Volatile memory controller [0108]: Intel Corporation PCIe Data Center SSD [8086:0953] (rev 01)
> 0000:08:00.0 Non-Volatile memory controller [0108]: Intel Corporation PCIe Data Center SSD [8086:0953] (rev 01)
> 0000:09:00.0 Non-Volatile memory controller [0108]: Intel Corporation PCIe Data Center SSD [8086:0953] (rev 01)
> 0000:84:00.0 Non-Volatile memory controller [0108]: Intel Corporation PCIe Data Center SSD [8086:0953] (rev 01)
> 0000:85:00.0 Non-Volatile memory controller [0108]: Intel Corporation PCIe Data Center SSD [8086:0953] (rev 01)
> 0000:86:00.0 Non-Volatile memory controller [0108]: Intel Corporation PCIe Data Center SSD [8086:0953] (rev 01)
> 0000:87:00.0 Non-Volatile memory controller [0108]: Intel Corporation PCIe Data Center SSD [8086:0953] (rev 01)
>
> My idea is to attach either 0000:03:00.0 or 0000:04:00.0 to a VM.
>
> This is how I identify the block device behind each PCI device:
>
> [root@zeus-59 ~]# ls -l /sys/bus/pci/devices/0000\:03\:00.0/nvme
> total 0
> drwxr-xr-x. 4 root root 0 Apr 3 14:02 nvme0
> [root@zeus-59 ~]# ls -l /sys/bus/pci/devices/0000\:04\:00.0/nvme
> total 0
> drwxr-xr-x. 4 root root 0 Apr 3 14:02 nvme1
>
> I installed the OS on the physical machine leaving those 2 drives intact:
>
> [root@zeus-59 ~]# lsblk
> NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> sda           8:0    0 59.6G  0 disk
> ├─sda1        8:1    0    1G  0 part /boot
> └─sda2        8:2    0    4G  0 part [SWAP]
> sr0          11:0    1  906M  0 rom
> nvme0n1     259:15   0  1.8T  0 disk
> nvme1n1     259:0    0  1.8T  0 disk
> nvme2n1     259:11   0  1.8T  0 disk
> └─nvme2n1p1 259:12   0  1.8T  0 part /var/lib/docker/btrfs
> nvme3n1     259:6    0  1.8T  0 disk
> └─nvme3n1p1 259:7    0  1.8T  0 part
> nvme4n1     259:13   0  1.8T  0 disk
> └─nvme4n1p1 259:14   0  1.8T  0 part
> nvme5n1     259:5    0  1.8T  0 disk
> └─nvme5n1p1 259:8    0  1.8T  0 part
> nvme6n1     259:17   0  1.8T  0 disk
> └─nvme6n1p1 259:18   0  1.8T  0 part
> nvme7n1     259:9    0  1.8T  0 disk
> └─nvme7n1p1 259:10   0  1.8T  0 part
> nvme8n1     259:3    0  1.8T  0 disk
> └─nvme8n1p1 259:4    0  1.8T  0 part
> nvme9n1     259:1    0  1.8T  0 disk
> └─nvme9n1p1 259:2    0  1.8T  0 part
>
> The next step is to configure OpenStack:
>
> nova api config
>
> [pci]
> alias = { "vendor_id":"8086", "product_id":"0953", "device_type":"type-PCI", "name":"nvme"}
>
>
> nova scheduler config
>
> [filter_scheduler]
> enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter
> available_filters = nova.scheduler.filters.all_filters
>
>
> nova compute config
>
> [filter_scheduler]
> enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter
> available_filters = nova.scheduler.filters.all_filters
>
> [pci]
> passthrough_whitelist = [ {"address":"0000:03:00.0"}, {"address":"0000:04:00.0"} ]
> alias = { "vendor_id":"8086", "product_id":"0953", "device_type":"type-PCI", "name":"nvme"}
>
>
> Then I created my flavor:
> openstack flavor create nvme.small --ram 64000 --disk 10 --vcpus 7 --property "pci_passthrough:alias"="nvme:1"
>
>
> I created the VM on the host with the free NVMe drives:
>
> [root@openstack-deployment ~]# openstack server create --flavor nvme.small --image centos7.5-image --nic net-id=hpc --security-group admin --key-name mykey --availability-zone nova:zeus-59.localdomain test_nvme_small
>
>
> VM creation fails, and I can see this error in the nova-compute logs:
>
> 2019-04-03 16:50:35.512 7 ERROR nova.virt.libvirt.guest [req-8478a99f-bad0-43d5-b405-55c45e8d8cae 91e83343e9834c8ba0172ff369c8acac b91520cff5bd45c59a8de07c38641582 - default default] Error launching a defined domain with XML: <domain type='kvm'>
> <name>instance-00000090</name>
> …
> <devices>
> <emulator>/usr/libexec/qemu-kvm</emulator>
> <disk type='file' device='disk'>
> <driver name='qemu' type='qcow2' cache='none'/>
> <source file='/var/lib/nova/instances/a261fcd4-eca3-4982-8b2a-1df33087ab40/disk'/>
> <target dev='vda' bus='virtio'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
> </disk>
> …
> 2019-04-03 16:50:35.513 7 ERROR nova.virt.libvirt.driver [req-8478a99f-bad0-43d5-b405-55c45e8d8cae 91e83343e9834c8ba0172ff369c8acac b91520cff5bd45c59a8de07c38641582 - default default] [instance: a261fcd4-eca3-4982-8b2a-1df33087ab40] Failed to start libvirt guest: libvirtError: unsupported configuration: host doesn't support passthrough of host PCI devices
This error suggests the IOMMU isn't enabled on the compute host. See the
troubleshooting section of this page for more information:
http://ask.xmodulo.com/pci-passthrough-virt-manager.html
If this isn't called out in the docs, it should be.
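As a quick check on the compute host, something like the sketch below
counts the IOMMU groups the kernel exposes under /sys/kernel/iommu_groups.
An empty or missing directory usually means the IOMMU is off. The function
name iommu_group_count and its optional directory argument are made up
here for illustration, and the intel_iommu=on hint assumes an Intel host,
which the original mail doesn't actually state.

```shell
#!/bin/sh
# Sketch only: count the IOMMU groups exposed by the kernel.
# iommu_group_count is a hypothetical helper, not part of nova or libvirt.
# The optional argument overrides the sysfs path (handy for testing).
iommu_group_count() {
    root="${1:-/sys/kernel/iommu_groups}"
    if [ -d "$root" ]; then
        # Each sub-directory of iommu_groups is one IOMMU group.
        find "$root" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
    else
        echo 0
    fi
}

count=$(iommu_group_count)
if [ "$count" -gt 0 ]; then
    echo "IOMMU appears to be enabled ($count groups found)"
else
    # Assumption: Intel host. Other platforms use different settings.
    echo "No IOMMU groups found; check that VT-d is enabled in firmware"
    echo "and that intel_iommu=on is on the kernel command line (/proc/cmdline)"
fi
```

If this reports zero groups, fixing the firmware/kernel settings and
rebooting the host should come before revisiting the nova config.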
Stephen
> Any idea or advice would be very helpful