<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto">Thank you!<div><br></div><div>That is what I’m also trying to do: give one GPU card to each VM. I have the exact same settings in my nova.conf. What version of libvirt are you running? </div><div><br></div><div>Did you install any special NVIDIA driver or similar on your compute node for passthrough? (I doubt it, because passthrough is straightforward.) </div><div><br></div><div>Do you have any NUMA settings in your flavor or on the compute node?<br><br><div dir="ltr">Sent from my iPhone</div><div dir="ltr"><br><blockquote type="cite">On Jan 20, 2022, at 2:52 AM, Massimo Sgaravatto &lt;massimo.sgaravatto@gmail.com&gt; wrote:<br><br></blockquote></div><blockquote type="cite"><div dir="ltr"><div dir="ltr"><div>Hi Satish</div><div><br></div><div>I am not able to understand what is wrong with your environment, but I can describe my setting.</div><div><br></div><div>I have a compute node with 4 Tesla V100S.</div><div>They have the same vendor id (10de) and the same product id (1df6) [*]</div><div>In nova.conf I defined the following in the [pci] section:</div><div><br></div><div>[pci]<br>passthrough_whitelist = {"vendor_id":"10de"}<br></div><div>alias={"name":"V100","product_id":"1df6","vendor_id":"10de","device_type":"type-PCI"}<br></div><div><br></div><div><br></div><div>I then created a flavor with this property:</div><div><br></div><div>pci_passthrough:alias='V100:1'<br></div><div><br></div><div>Using this flavor I can instantiate 4 VMs: each one can see a single V100</div><div><br></div><div>Hope this helps</div><div><br></div><div>Cheers, Massimo</div><div><br></div><div><br></div><div>[*]</div><div># lspci -nnk -d 10de:<br>60:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] [10de:1df6] (rev a1)<br>        Subsystem: NVIDIA Corporation Device [10de:13d6]<br>      Kernel driver in use: vfio-pci<br>        Kernel modules: nouveau<br>61:00.0 3D controller [0302]: NVIDIA 
Corporation GV100GL [Tesla V100S PCIe 32GB] [10de:1df6] (rev a1)<br>        Subsystem: NVIDIA Corporation Device [10de:13d6]<br>      Kernel driver in use: vfio-pci<br>        Kernel modules: nouveau<br>da:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] [10de:1df6] (rev a1)<br>        Subsystem: NVIDIA Corporation Device [10de:13d6]<br>      Kernel driver in use: vfio-pci<br>        Kernel modules: nouveau<br>db:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] [10de:1df6] (rev a1)<br>        Subsystem: NVIDIA Corporation Device [10de:13d6]<br>      Kernel driver in use: vfio-pci<br>        Kernel modules: nouveau<br>[root@cld-np-gpu-01 ~]# <br></div><div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Jan 19, 2022 at 10:28 PM Satish Patel <<a href="mailto:satish.txt@gmail.com">satish.txt@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Massimo,<br>

<br>

Ignore my last email. My requirement is to have a single VM with a<br>

single GPU ("tesla-v100:1"), but I would also like to create a second VM on<br>

the same compute node that uses the second GPU. When I create the<br>

second VM, it errors out with the message below. It looks like<br>

it is not allowing me to create a second VM and bind it to the second GPU<br>

card.<br>

<br>

error : virDomainDefDuplicateHostdevInfoValidate:1082 : XML error:<br>

Hostdev already exists in the domain configuration<br>

<br>

On Wed, Jan 19, 2022 at 3:10 PM Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>> wrote:<br>

><br>

> Do I need to create a flavor that targets both GPUs? Is it possible to<br>

> have a single flavor cover both GPUs? End users won't know<br>

> which flavor to use.<br>

><br>

> On Wed, Jan 19, 2022 at 1:54 AM Massimo Sgaravatto<br>

> <<a href="mailto:massimo.sgaravatto@gmail.com" target="_blank">massimo.sgaravatto@gmail.com</a>> wrote:<br>

> ><br>

> > If I am not wrong those are 2 GPUs<br>

> ><br>

> > "tesla-v100:1" means 1 GPU<br>

> ><br>

> > So, for example, a flavor with {"pci_passthrough:alias": "tesla-v100:2"} will be used to create an instance with 2 GPUs<br>
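<br>
As a hedged illustration of the count semantics described above, the sketch below splits an alias value into a name and a device count. The helper name parse_alias_spec is hypothetical and is not part of Nova's API, and accepting a comma-separated list of aliases is an assumption.<br>
<pre>
```python
def parse_alias_spec(spec):
    """Split an alias value such as 'tesla-v100:2' into (name, count) pairs."""
    requests = []
    # Assumption: several aliases may be listed, separated by commas.
    for item in spec.split(","):
        name, _, count = item.strip().partition(":")
        # The number after the colon is how many devices to attach,
        # not the index of a particular card.
        requests.append((name, int(count)))
    return requests


if __name__ == "__main__":
    print(parse_alias_spec("tesla-v100:1"))  # one GPU per instance
    print(parse_alias_spec("tesla-v100:2"))  # two GPUs per instance
```
</pre>
Under this reading, two instances that each request "tesla-v100:1" would each be given one card, provided two cards are free on the host.<br>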

> ><br>

> > Cheers, Massimo<br>

> ><br>

> > On Tue, Jan 18, 2022 at 11:35 PM Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>> wrote:<br>

> >><br>

> >> Thank you for the information.  I have a quick question.<br>

> >><br>

> >> [root@gpu01 ~]# lspci | grep -i nv<br>

> >> 5e:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100S PCIe<br>

> >> 32GB] (rev a1)<br>

> >> d8:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100S PCIe<br>

> >> 32GB] (rev a1)<br>

> >><br>

> >> The output above shows two cards. Does that mean there are two<br>

> >> physical cards, or is it just a bus representation?<br>

> >><br>

> >> Also, I have the following entry in my OpenStack flavor. Does :1 mean<br>

> >> the first GPU card?<br>

> >><br>

> >> {"gpu-node": "true", "pci_passthrough:alias": "tesla-v100:1"}<br>

> >><br>

> >><br>

> >><br>

> >><br>

> >><br>

> >><br>

> >> On Tue, Jan 18, 2022 at 5:55 AM António Paulo <<a href="mailto:antonio.paulo@cern.ch" target="_blank">antonio.paulo@cern.ch</a>> wrote:<br>

> >> ><br>

> >> > Hey Satish, Gustavo,<br>

> >> ><br>

> >> > Just to clarify a bit on point 3, you will have to buy a vGPU license<br>

> >> > per card and this gives you access to all the downloads you need through<br>

> >> > NVIDIA's web dashboard -- both the host and guest drivers as well as the<br>

> >> > license server setup files.<br>

> >> ><br>

> >> > Cheers,<br>

> >> > António<br>

> >> ><br>

> >> > On 18/01/22 02:46, Satish Patel wrote:<br>

> >> > > Thank you so much! This is what I was looking for. It is very odd that<br>

> >> > > we buy a pricey card but then we have to buy a license to make those<br>

> >> > > features available.<br>

> >> > ><br>

> >> > > On Mon, Jan 17, 2022 at 2:07 PM Gustavo Faganello Santos<br>

> >> > > <<a href="mailto:gustavofaganello.santos@windriver.com" target="_blank">gustavofaganello.santos@windriver.com</a>> wrote:<br>

> >> > >><br>

> >> > >> Hello, Satish.<br>

> >> > >><br>

> >> > >> I've been working with vGPU lately and I believe I can answer your<br>

> >> > >> questions:<br>

> >> > >><br>

> >> > >> 1. As you pointed out in question #2, the pci-passthrough will allocate<br>

> >> > >> the entire physical GPU to one single guest VM, while vGPU allows you to<br>

> >> > >> spawn from 1 to several VMs using the same physical GPU, depending on<br>

> >> > >> the vGPU type you choose (check NVIDIA docs to see which vGPU types the<br>

> >> > >> Tesla V100 supports and their properties);<br>

> >> > >> 2. Correct;<br>

> >> > >> 3. To use vGPU, you need vGPU drivers installed on the platform where<br>

> >> > >> your deployment of OpenStack is running AND in the VMs, so there are two<br>

> >> > >> drivers to be installed in order to use the feature. I believe both of<br>

> >> > >> them have to be purchased from NVIDIA in order to be used, and you would<br>

> >> > >> also have to deploy an NVIDIA licensing server in order to validate the<br>

> >> > >> licenses of the drivers running in the VMs.<br>

> >> > >> 4. You can see what the instructions are for each of these scenarios in<br>

> >> > >> [1] and [2].<br>

> >> > >><br>

> >> > >> There is also extensive documentation on vGPU at NVIDIA's website [3].<br>

> >> > >><br>

> >> > >> [1] <a href="https://docs.openstack.org/nova/wallaby/admin/virtual-gpu.html" rel="noreferrer" target="_blank">https://docs.openstack.org/nova/wallaby/admin/virtual-gpu.html</a><br>

> >> > >> [2] <a href="https://docs.openstack.org/nova/wallaby/admin/pci-passthrough.html" rel="noreferrer" target="_blank">https://docs.openstack.org/nova/wallaby/admin/pci-passthrough.html</a><br>

> >> > >> [3] <a href="https://docs.nvidia.com/grid/13.0/index.html" rel="noreferrer" target="_blank">https://docs.nvidia.com/grid/13.0/index.html</a><br>

> >> > >><br>

> >> > >> Regards,<br>

> >> > >> Gustavo.<br>

> >> > >><br>

> >> > >> On 17/01/2022 14:41, Satish Patel wrote:<br>

> >> > >>><br>

> >> > >>> Folks,<br>

> >> > >>><br>

> >> > >>> We have a Tesla V100 32GB GPU and I’m trying to configure it with OpenStack Wallaby. This is my first time dealing with GPUs, so I have a couple of questions.<br>

> >> > >>><br>

> >> > >>> 1. What is the difference between passthrough and vGPU? I searched online, but it is still not clear to me.<br>

> >> > >>> 2. If I configure passthrough, does it only work with a single VM? (I mean, the whole GPU gets allocated to a single VM, correct?)<br>

> >> > >>> 3. Some documents say the Tesla V100 supports vGPU, but some folks say you need a license. I have no idea where to get that license. What is the deal here?<br>

> >> > >>> 4. What are the configuration differences between setting this card up with passthrough versus vGPU?<br>

> >> > >>><br>

> >> > >>><br>

> >> > >>> Currently I have it configured with passthrough, based on an article, and I am able to spin up a VM and see the NVIDIA card exposed to it (I used IOMMU and the vfio driver). If this card supports vGPU, do I still need IOMMU and vfio, or some other driver, to virtualize it?<br>

> >> > >>><br>

> >> > >>> Sent from my iPhone<br>

> >> > >>><br>

> >> > ><br>

> >><br>

</blockquote></div></div>

</div></blockquote></div></body></html>