Re: Reply: Experience with VGPUs

Gene Kuo igene at igene.tw
Fri Jan 13 13:07:35 UTC 2023


Hi Oliver,

I have some experience using Nvidia vGPUs (Tesla P4) in my own OpenStack cluster. The setup is pretty simple: follow the Nvidia guide to install the Linux KVM drivers[1], then the OpenStack documentation[2] to attach vGPU mdevs to your instances. Licensing is on the client (VM) side, not the server (hypervisor) side. The cards you mentioned (RTX 3050/3060) don't support vGPU; Nvidia publishes a list of supported cards[3].
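
To check that the host driver is installed correctly before touching the Nova configuration, you can look at which mdev types the card exposes in sysfs. Below is a minimal Python sketch of that check. It assumes the standard kernel mdev layout under /sys/class/mdev_bus (the same path the OpenStack guide[2] refers to); the function name list_mdev_types is only my own illustration, not part of any Nvidia or OpenStack tool.

```python
from pathlib import Path


def list_mdev_types(sysfs_root="/sys/class/mdev_bus"):
    """Return {pci_address: [(type_id, name, available_instances), ...]}.

    Hypothetical helper: it only reads the sysfs files that the kernel
    mdev framework exposes for each mediated-device-capable PCI device.
    """
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        # No mdev-capable device found, or the host driver is not loaded.
        return result
    for dev in sorted(root.iterdir()):
        types_dir = dev / "mdev_supported_types"
        if not types_dir.is_dir():
            continue
        entries = []
        for mdev_type in sorted(types_dir.iterdir()):
            name_file = mdev_type / "name"
            avail_file = mdev_type / "available_instances"
            # "name" is optional in the mdev sysfs interface.
            name = name_file.read_text().strip() if name_file.is_file() else ""
            avail = int(avail_file.read_text().strip()) if avail_file.is_file() else 0
            entries.append((mdev_type.name, name, avail))
        result[dev.name] = entries
    return result


if __name__ == "__main__":
    for pci, types in list_mdev_types().items():
        for type_id, name, avail in types:
            print(f"{pci}  {type_id:12s} {name:20s} available={avail}")
```

The type ID printed here (something of the form nvidia-NN) is the value that goes into the [devices] section of nova.conf as described in [2], and a flavor then requests a vGPU with the resources:VGPU=1 property.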

I have no experience with newer cards that use MIG, but I would expect the overall procedure to be similar.

As for AMD cards, AMD has stated that some of their MI series cards support SR-IOV for vGPUs. However, those drivers have never been open-sourced or released to the public in closed-source form; only large cloud providers are able to get them. So I don't really recommend getting AMD cards for vGPU unless you are able to get support from AMD.

Regards,
Gene Kuo

[1] https://docs.nvidia.com/grid/13.0/grid-vgpu-user-guide/index.html#red-hat-el-kvm-install-configure-vgpu
[2] https://docs.openstack.org/nova/latest/admin/virtual-gpu.html
[3] https://docs.nvidia.com/grid/gpus-supported-by-vgpu.html

------- Original Message -------
On Friday, January 13th, 2023 at 12:03 PM, Brin Zhang <zhangbailin at inspur.com> wrote:


> -----Original Message-----
> 
> > From: Arne Wiebalck [mailto:Arne.Wiebalck at cern.ch]
> > Sent: 12 January 2023 15:43
> > To: Oliver Weinmann oliver.weinmann at me.com; openstack-discuss openstack-discuss at lists.openstack.org
> > Subject: Re: Experience with VGPUs
> > 
> > Hi Oliver,
> > 
> > The presentation you linked was only at CERN, not from CERN (it was during an OpenStack Day we organised here). Sylvain and/or Mohammed may be available to answer the questions you have related to that deck, or also in general for the integration of GPUs.
> 
> > Now, at CERN we also have hypervisors with different GPUs in our fleet, and are also looking into various options how to efficiently provision them:
> > as bare metal, as vGPUs, using MIG support, ... and we have submitted a presentation proposal for the upcoming summit to share our experiences.
> 
> > If you have very specific questions, we can try to answer them here, but maybe there is interest and it would be more efficient to organize a session/call (e.g. as part of the Openstack Operators activities or the Scientific SIG?) to exchange experiences on GPU integration and answer questions there?
> 
> > What do you and others think?
> 
> > Cheers,
> > Arne
> > 
> > ________________________________________
> > From: Oliver Weinmann oliver.weinmann at me.com
> > Sent: Thursday, 12 January 2023 07:56
> > To: openstack-discuss
> > Subject: Experience with VGPUs
> > 
> > Dear All,
> > 
> > we are planning to have a POC on VGPUs in our Openstack cluster. Therefore I have a few questions and generally wanted to ask how well VGPUs are supported in Openstack. The docs, in particular:
> > 
> > https://docs.openstack.org/nova/zed/admin/virtual-gpu.html
> > 
> > explain quite well the general implementation.
> > 
> > But I am more interested in general experience with using VGPUs in Openstack. We currently have a small Yoga cluster, planning to upgrade to Zed soon, with a couple of compute nodes. Currently our users use consumer cards like RTX 3050/3060 on their laptops and the idea would be to provide VGPUs to these users. For this I would like to make a very small POC where we first equip one compute node with an Nvidia GPU. A few tips on which card would be a good starting point would also be highly appreciated. I know this heavily depends on the server hardware but this is something I can figure out later. Also do we need additional software licenses to run this? I saw this very nice presentation from CERN on VGPUs:
> > 
> > https://indico.cern.ch/event/776411/contributions/3345183/attachments/1851624/3039917/02_-_vGPUs_with_OpenStack_-_Accelerating_Science.pdf
> 
> > In the table they are listing Quadro vDWS licenses. I assume we need these in order to use the cards? Also do we need something like Cyborg for this or is VGPU fully implemented in Nova?
> 
> 
> You can try using Cyborg to manage your GPU devices; it also supports listing and attaching vGPUs for an instance. If you want to attach or detach a device from an instance, you currently have to change the flavor, because the vGPU/GPU information has to be added to the flavor for now. (Using this feature more flexibly would probably require separating such GPU metadata from the flavor; we have discussed this in the Nova team before.)
> I work at Inspur; in our InCloud OS product we use Cyborg to manage GPU/vGPU, FPGA, QAT and similar devices, and we have adapted it for T4/T100 GPUs (which support vGPU) and the A100 (which supports MIG). I think Cyborg is the better way to manage local GPU devices; please refer to the Cyborg API docs: https://docs.openstack.org/api-ref/accelerator/
> 
> > Best Regards,
> 
> > Oliver


