Tesla V100 32G GPU with openstack

Massimo Sgaravatto massimo.sgaravatto at gmail.com
Thu Jan 20 07:51:54 UTC 2022


Hi Satish

I am not able to tell what is wrong with your environment, but I can
describe my setup.

I have a compute node with 4 Tesla V100S.
They all have the same vendor ID (10de) and the same product ID (1df6) [*]
In nova.conf I defined the following in the [pci] section:

[pci]
passthrough_whitelist = {"vendor_id":"10de"}
alias={"name":"V100","product_id":"1df6","vendor_id":"10de","device_type":"type-PCI"}
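
The whitelist can also be narrowed by product_id, so that only the V100S
boards are exposed rather than every NVIDIA device on the host. A sketch of
that tighter variant (same IDs as above; it assumes you do not want any
other 10de devices passed through):

```ini
[pci]
# Restrict the whitelist to the V100S device ID instead of all 10de devices
passthrough_whitelist = {"vendor_id": "10de", "product_id": "1df6"}
alias = {"name": "V100", "product_id": "1df6", "vendor_id": "10de", "device_type": "type-PCI"}
```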


I then created a flavor with this property:

pci_passthrough:alias='V100:1'

Using this flavor I can instantiate 4 VMs: each one sees a single V100.
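
Note that the "1" in "V100:1" is a device count, not a card index. A
minimal sketch of how the alias request string splits (the parsing here is
only illustrative, not Nova's actual implementation):

```shell
# "V100:1" = request 1 device matching the "V100" alias from nova.conf.
req="V100:1"
alias_name="${req%%:*}"   # the alias name defined in nova.conf -> "V100"
count="${req##*:}"        # how many matching devices each VM gets -> "1"
echo "alias=${alias_name} count=${count}"
```

So the same flavor schedules onto whichever of the 4 cards is free, and a
count of 2 would reserve two cards for a single VM.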

Hope this helps

Cheers, Massimo


[*]
# lspci -nnk -d 10de:
60:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] [10de:1df6] (rev a1)
	Subsystem: NVIDIA Corporation Device [10de:13d6]
	Kernel driver in use: vfio-pci
	Kernel modules: nouveau
61:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] [10de:1df6] (rev a1)
	Subsystem: NVIDIA Corporation Device [10de:13d6]
	Kernel driver in use: vfio-pci
	Kernel modules: nouveau
da:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] [10de:1df6] (rev a1)
	Subsystem: NVIDIA Corporation Device [10de:13d6]
	Kernel driver in use: vfio-pci
	Kernel modules: nouveau
db:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] [10de:1df6] (rev a1)
	Subsystem: NVIDIA Corporation Device [10de:13d6]
	Kernel driver in use: vfio-pci
	Kernel modules: nouveau
[root@cld-np-gpu-01 ~]#
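
When building the alias, the product_id must come from the [vendor:device]
bracket in the lspci -nn output (1df6 here), not from the Subsystem line
(13d6). A small sketch that pulls the pair out of one such line (the
parsing is just illustrative):

```shell
# One device line from `lspci -nnk -d 10de:` (copied from above)
line='60:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] [10de:1df6] (rev a1)'
# The last [xxxx:xxxx] bracket holds vendor_id:product_id for nova.conf
ids=$(echo "$line" | grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}\]' | tail -n 1 | tr -d '[]')
echo "$ids"   # -> 10de:1df6
```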


On Wed, Jan 19, 2022 at 10:28 PM Satish Patel <satish.txt at gmail.com> wrote:

> Hi Massimo,
>
> Ignore my last email. My requirement is a single VM with a single GPU
> ("tesla-v100:1"), but I would also like to create a second VM on the
> same compute node that uses the second GPU. The second VM errors out
> with the following when I create it; it looks like it's not allowing
> me to bind a second VM to the second GPU card.
>
> error : virDomainDefDuplicateHostdevInfoValidate:1082 : XML error:
> Hostdev already exists in the domain configuration
>
> On Wed, Jan 19, 2022 at 3:10 PM Satish Patel <satish.txt at gmail.com> wrote:
> >
> > Do I need to create a flavor to target each GPU? Is it possible to
> > have a single flavor cover both GPUs, since end users don't
> > understand which flavor to use?
> >
> > On Wed, Jan 19, 2022 at 1:54 AM Massimo Sgaravatto
> > <massimo.sgaravatto at gmail.com> wrote:
> > >
> > > If I am not wrong, those are 2 GPUs.
> > >
> > > "tesla-v100:1" means 1 GPU
> > >
> > > So e.g. a flavor with {"pci_passthrough:alias": "tesla-v100:2"} will
> > > be used to create an instance with 2 GPUs.
> > >
> > > Cheers, Massimo
> > >
> > > On Tue, Jan 18, 2022 at 11:35 PM Satish Patel <satish.txt at gmail.com> wrote:
> > >>
> > >> Thank you for the information.  I have a quick question.
> > >>
> > >> [root@gpu01 ~]# lspci | grep -i nv
> > >> 5e:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] (rev a1)
> > >> d8:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] (rev a1)
> > >>
> > >> The above output shows two cards; does that mean there are
> > >> physically two, or is it just a bus representation?
> > >>
> > >> Also, I have the following entry in my OpenStack flavor; does ":1"
> > >> mean the first GPU card?
> > >>
> > >> {"gpu-node": "true", "pci_passthrough:alias": "tesla-v100:1"}
> > >>
> > >> On Tue, Jan 18, 2022 at 5:55 AM António Paulo <antonio.paulo at cern.ch> wrote:
> > >> >
> > >> > Hey Satish, Gustavo,
> > >> >
> > >> > Just to clarify a bit on point 3, you will have to buy a vGPU
> > >> > license per card, and this gives you access to all the downloads
> > >> > you need through NVIDIA's web dashboard -- both the host and
> > >> > guest drivers as well as the license server setup files.
> > >> >
> > >> > Cheers,
> > >> > António
> > >> >
> > >> > On 18/01/22 02:46, Satish Patel wrote:
> > >> > > Thank you so much! This is what I was looking for. It is very
> > >> > > odd that we buy a pricey card but then we have to buy a
> > >> > > license to make those features available.
> > >> > >
> > >> > > On Mon, Jan 17, 2022 at 2:07 PM Gustavo Faganello Santos
> > >> > > <gustavofaganello.santos at windriver.com> wrote:
> > >> > >>
> > >> > >> Hello, Satish.
> > >> > >>
> > >> > >> I've been working with vGPU lately and I believe I can
> > >> > >> answer your questions:
> > >> > >>
> > >> > >> 1. As you pointed out in question #2, pci-passthrough will
> > >> > >> allocate the entire physical GPU to one single guest VM,
> > >> > >> while vGPU allows you to spawn from 1 to several VMs using
> > >> > >> the same physical GPU, depending on the vGPU type you choose
> > >> > >> (check NVIDIA docs to see which vGPU types the Tesla V100
> > >> > >> supports and their properties);
> > >> > >> 2. Correct;
> > >> > >> 3. To use vGPU, you need vGPU drivers installed on the
> > >> > >> platform where your deployment of OpenStack is running AND
> > >> > >> in the VMs, so there are two drivers to be installed in
> > >> > >> order to use the feature. I believe both of them have to be
> > >> > >> purchased from NVIDIA in order to be used, and you would
> > >> > >> also have to deploy an NVIDIA licensing server in order to
> > >> > >> validate the licenses of the drivers running in the VMs.
> > >> > >> 4. You can see what the instructions are for each of these
> > >> > >> scenarios in [1] and [2].
> > >> > >>
> > >> > >> There is also extensive documentation on vGPU at NVIDIA's
> > >> > >> website [3].
> > >> > >>
> > >> > >> [1] https://docs.openstack.org/nova/wallaby/admin/virtual-gpu.html
> > >> > >> [2] https://docs.openstack.org/nova/wallaby/admin/pci-passthrough.html
> > >> > >> [3] https://docs.nvidia.com/grid/13.0/index.html
> > >> > >>
> > >> > >> Regards,
> > >> > >> Gustavo.
> > >> > >>
> > >> > >> On 17/01/2022 14:41, Satish Patel wrote:
> > >> > >>> [Please note: This e-mail is from an EXTERNAL e-mail address]
> > >> > >>>
> > >> > >>> Folk,
> > >> > >>>
> > >> > >>> We have a Tesla V100 32G GPU and I'm trying to configure it
> > >> > >>> with OpenStack Wallaby. This is my first time dealing with
> > >> > >>> a GPU, so I have a couple of questions.
> > >> > >>>
> > >> > >>> 1. What is the difference between passthrough vs vGPU? I
> > >> > >>> did google it but it is not very clear yet.
> > >> > >>> 2. If I configure it as passthrough, does it only work with
> > >> > >>> a single VM? (I mean the whole GPU will get allocated to a
> > >> > >>> single VM, correct?)
> > >> > >>> 3. Also, some documents say the Tesla V100 supports vGPU,
> > >> > >>> but some folks say you need a license. I have no idea where
> > >> > >>> to get that license. What is the deal here?
> > >> > >>> 4. What are the config differences between configuring this
> > >> > >>> card with passthrough vs vGPU?
> > >> > >>>
> > >> > >>> Currently I configured it with passthrough based on an
> > >> > >>> article and I was able to spin up a VM, and I can see the
> > >> > >>> NVIDIA card exposed to the VM. (I used IOMMU and the
> > >> > >>> vfio-based driver.) So if this card supports vGPU, do I
> > >> > >>> need IOMMU and vfio or some other driver to make it
> > >> > >>> virtualized?
> > >> > >>>
> > >> > >>> Sent from my iPhone
> > >> > >>>
> > >> > >
> > >>
>

