[Openstack-operators] How are folks providing GPU instance types?

Joe Topjian joe at topjian.net
Wed May 11 16:51:09 UTC 2016


Just wanted to add a few notes (I apologize for the brevity):

* The wiki page is indeed the best source of information to get started.
* I found that I didn't have to use EFI-based images. I wonder why that is?
* PCI devices and IDs can be found by running the following on a compute node:

$ lspci -nn | grep -i nvidia
84:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK107GL [GRID K1] [10de:0ff2] (rev a1)
85:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK107GL [GRID K1] [10de:0ff2] (rev a1)
86:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK107GL [GRID K1] [10de:0ff2] (rev a1)
87:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK107GL [GRID K1] [10de:0ff2] (rev a1)

Here, 10de is the vendor ID and 0ff2 is the product ID.
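
If you have a mix of cards, a quick one-liner (a rough sketch; it assumes the vendor:device pair is the last bracketed field, as in the output above) pulls out the unique ID pairs:

$ lspci -nn | grep -i nvidia | sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p' | sort -u
10de:0ff2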

* My nova.conf looks like this:

pci_alias={"vendor_id":"10de", "product_id":"0ff2", "name":"gpu"}
scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler
scheduler_available_filters=nova.scheduler.filters.all_filters
scheduler_available_filters=nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter
scheduler_default_filters=RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter
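
One thing worth calling out: the compute nodes also need a whitelist entry so that nova-compute reports the devices to the scheduler in the first place. Something along these lines (same vendor/product IDs as the alias) should do it:

pci_passthrough_whitelist={"vendor_id":"10de", "product_id":"0ff2"}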

* My /etc/default/grub on the compute node has the following entries:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=pt rd.modules-load=vfio-pci"
GRUB_CMDLINE_LINUX="intel_iommu=on iommu=pt rd.modules-load=vfio-pci"

* I use the following to create a flavor (8192 MB RAM, 20 GB disk, 4 vCPUs) with access to a single GPU:

nova flavor-create g1.large auto 8192 20 4 --ephemeral 20 --swap 2048
nova flavor-key g1.large set "pci_passthrough:alias"="gpu:1"
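
Booting with the flavor is then the usual routine (the image and network here are placeholders):

$ nova boot --flavor g1.large --image <image> --nic net-id=<net-uuid> gpu-test

Once the instance is up, running lspci -nn inside it should show the GRID K1.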

For NVIDIA cards in particular, it might take a few attempts to install the correct driver and CUDA toolkit versions, etc., to get things working correctly. NVIDIA bundles a set of CUDA examples, one of which is "/usr/local/cuda-7.5/samples/1_Utilities/deviceQuery". Running it will confirm whether the instance can successfully access the GPU.
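
For reference, the samples have to be built before they can be run; inside the instance, something like:

$ cd /usr/local/cuda-7.5/samples/1_Utilities/deviceQuery
$ sudo make
$ ./deviceQuery | tail -n 1
Result = PASS

A final line of "Result = PASS" means the driver and CUDA toolkit can both see the GPU.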

Hope this helps!
Joe


On Tue, May 10, 2016 at 8:58 AM, Tomas Vondra <vondra at czech-itc.cz> wrote:

> Nordquist, Peter L <Peter.Nordquist at ...> writes:
>
> > You will also have to enable iommu on your hypervisors to have libvirt
> > expose the capability to Nova for PCI passthrough.  I use CentOS 7 and
> > had to set 'iommu=pt intel_iommu=on' for my kernel parameters.  Along
> > with this, you'll have to start using EFI for your VMs by installing
> > OVMF on your hypervisors and configuring your images appropriately.  I
> > don't have a link handy for this, but the gist is that legacy
> > bootloaders have a much more complicated process to initialize the
> > devices being passed to the VM, whereas EFI is much easier.
>
> Hi!
> What I found out the hard way under the Xen hypervisor is that the GPU you
> are passing through must not be the primary GPU of the system. Otherwise,
> you get memory corruption as soon as something appears on the console. If
> not sooner :-). Test if your motherboards are capable of running on the
> integrated VGA even if some other graphics card is connected. Or blacklist
> it for the kernel.
> Tomas