[Openstack-operators] PCI Passthrough issues

Blair Bethwaite blair.bethwaite at gmail.com
Wed Jul 6 14:35:33 UTC 2016


Hi Jon,

Do you have the nouveau driver/module loaded in the host by any
chance? If so, blacklist, reboot, repeat.
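
For reference, a minimal sketch of that check and the blacklist entries,
assuming an Ubuntu-style host (the 3.13.0-87-generic kernel in the quoted
boot line suggests one). The helper names module_listed and blacklist_lines
and the file name blacklist-nouveau.conf are made up for illustration;
lsmod, /etc/modprobe.d and update-initramfs are the standard pieces:

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not from any package.

# Succeeds if module "$1" appears as a loaded module in lsmod-style
# text read from stdin (first column is the module name).
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Prints modprobe.d lines that keep module "$1" from loading.
# "modeset=0" is an extra safeguard commonly paired with the blacklist.
blacklist_lines() {
    printf 'blacklist %s\n' "$1"
    printf 'options %s modeset=0\n' "$1"
}

# Usage on the hypervisor, as root (file name is an assumption; any
# .conf file under /etc/modprobe.d is read):
#   if lsmod | module_listed nouveau; then
#       blacklist_lines nouveau > /etc/modprobe.d/blacklist-nouveau.conf
#       update-initramfs -u && reboot
#   fi
```

After the reboot, "lsmod | module_listed nouveau" should fail; if it still
succeeds, the module is probably being pulled in from the initramfs.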

While we're on the subject, has anyone had any luck doing this
with hosts that have a PCI-e switch across multiple GPUs?

Cheers,

On 6 July 2016 at 23:27, Jonathan D. Proulx <jon at csail.mit.edu> wrote:
> Hi All,
>
> I'm trying to pass through some Nvidia K80 GPUs to some instances and have
> gotten to the point where Nova seems to be doing the right thing: GPU
> instances are scheduled on the one GPU hypervisor I have, and inside the
> VM I see:
>
> root@gpu-x1:~# lspci | grep -i k80
> 00:06.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
>
> And I can install the nvidia-361 driver and get:
>
> # ls /dev/nvidia*
> /dev/nvidia0  /dev/nvidiactl  /dev/nvidia-uvm  /dev/nvidia-uvm-tools
>
> Once I load up cuda-7.5 and build the examples, none of them run,
> claiming there's no CUDA device.
>
> # ./matrixMul
> [Matrix Multiply Using CUDA] - Starting...
> cudaGetDevice returned error no CUDA-capable device is detected (code 38), line(396)
> cudaGetDeviceProperties returned error no CUDA-capable device is detected (code 38), line(409)
> MatrixA(160,160), MatrixB(320,160)
> cudaMalloc d_A returned error no CUDA-capable device is detected (code 38), line(164)
>
> I'm not really familiar with CUDA, but I did get some example code
> running on the physical system for burn-in over the weekend (since
> reinstalled, so there is no nvidia driver on the hypervisor).
>
> Following various online examples for setting up passthrough, I set
> the kernel boot line on the hypervisor to:
>
> # cat /proc/cmdline
> BOOT_IMAGE=/boot/vmlinuz-3.13.0-87-generic root=UUID=d9bc9159-fedf-475b-b379-f65490c71860 ro console=tty0 console=ttyS1,115200 intel_iommu=on iommu=pt rd.modules-load=vfio-pci nosplash nomodeset intel_iommu=on iommu=pt rd.modules-load=vfio-pci nomdmonddf nomdmonisw
>
> I'm puzzled that I apparently have the device but it is
> nonfunctional. Where do I even look from here?
>
> -Jon



-- 
Cheers,
~Blairo


