[Openstack-operators] PCI Passthrough issues

Jonathan D. Proulx jon at csail.mit.edu
Wed Jul 6 13:27:07 UTC 2016


Hi All,

I'm trying to pass through some Nvidia K80 GPUs to some instances and have
gotten to the point where Nova seems to be doing the right thing: GPU
instances are scheduled on the one GPU hypervisor I have, and inside the
VM I see:

root@gpu-x1:~# lspci | grep -i k80
00:06.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)

And I can install the nvidia-361 driver and get:

# ls /dev/nvidia*
/dev/nvidia0  /dev/nvidiactl  /dev/nvidia-uvm  /dev/nvidia-uvm-tools

Once I load up cuda-7.5 and build the examples, none of them run,
claiming there's no CUDA device.

# ./matrixMul
[Matrix Multiply Using CUDA] - Starting...
cudaGetDevice returned error no CUDA-capable device is detected (code 38), line(396)
cudaGetDeviceProperties returned error no CUDA-capable device is detected (code 38), line(409)
MatrixA(160,160), MatrixB(320,160)
cudaMalloc d_A returned error no CUDA-capable device is detected (code 38), line(164)

I'm not really familiar with CUDA, but I did get some example code
running on the physical system for burn-in over the weekend (since
reinstalled, so there is no nvidia driver on the hypervisor).

Following various online examples for setting up passthrough, I set
the kernel boot line on the hypervisor to:

# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-3.13.0-87-generic root=UUID=d9bc9159-fedf-475b-b379-f65490c71860 ro console=tty0 console=ttyS1,115200 intel_iommu=on iommu=pt rd.modules-load=vfio-pci nosplash nomodeset intel_iommu=on iommu=pt rd.modules-load=vfio-pci nomdmonddf nomdmonisw
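
A boot line like the one above is easier to sanity-check with a small
script than by eye. The following is only a minimal sketch, not part of
the setup described here: the function names has_param and dup_params are
made up for illustration, and it only looks at the command line text, so
it says nothing about whether the IOMMU or vfio-pci is actually active.

```shell
#!/bin/sh
# Sketch: inspect a kernel command line for passthrough-related parameters.
# has_param and dup_params are hypothetical helper names for this example.

has_param() {
    # $1 = full command line, $2 = exact parameter (e.g. intel_iommu=on)
    # Succeeds only if $2 appears as a whole space-separated word.
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

dup_params() {
    # $1 = full command line
    # Prints each parameter that appears more than once, one per line.
    printf '%s\n' "$1" | tr -s ' ' '\n' | sort | uniq -d
}

# Read the running kernel's command line (empty if the file is unavailable).
cmdline=$(cat /proc/cmdline 2>/dev/null)

for want in intel_iommu=on iommu=pt; do
    if has_param "$cmdline" "$want"; then
        echo "present: $want"
    else
        echo "MISSING: $want"
    fi
done

echo "repeated parameters:"
dup_params "$cmdline"
```

Run against the boot line shown above, this would report intel_iommu=on,
iommu=pt and rd.modules-load=vfio-pci as repeated, since each appears
twice on that line.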

I'm puzzled that I apparently have the device but it is apparently
nonfunctional. Where do I even look from here?

-Jon



