-----Original Message-----
From: Arne Wiebalck [mailto:Arne.Wiebalck@cern.ch] Sent: 12 January 2023 15:43 To: Oliver Weinmann <oliver.weinmann@me.com>; openstack-discuss <openstack-discuss@lists.openstack.org> Subject: Re: Experience with VGPUs
Hi Oliver,
The presentation you linked was only *at* CERN, not *from* CERN (it was during an OpenStack Day we organised here). Sylvain and/or Mohammed may be available to answer the questions you have related to that deck, or also in general for the integration of GPUs.
Now, *at* CERN we also have hypervisors with different GPUs in our fleet, and we are looking into various options for provisioning them efficiently: as bare metal, as vGPUs, using MIG support, ... We have also submitted a presentation proposal for the upcoming summit to share our experiences.
If you have very specific questions, we can try to answer them here, but if there is interest it may be more efficient to organize a session/call (e.g. as part of the OpenStack Operators activities or the Scientific SIG?) to exchange experiences on GPU integration and answer questions there.
What do you and others think?
Cheers, Arne
________________________________________ From: Oliver Weinmann <oliver.weinmann@me.com> Sent: Thursday, 12 January 2023 07:56 To: openstack-discuss Subject: Experience with VGPUs
Dear All,
we are planning a POC on VGPUs in our OpenStack cluster. I therefore have a few questions, and I would also like to ask in general how well VGPUs are supported in OpenStack. The docs, in particular:
https://docs.openstack.org/nova/zed/admin/virtual-gpu.html
explain quite well the general implementation.
But I am more interested in general experience with using VGPUs in OpenStack. We currently have a small Yoga cluster with a couple of compute nodes, and we plan to upgrade to Zed soon. Our users currently use consumer cards like the RTX 3050/3060 in their laptops, and the idea would be to provide VGPUs to these users. For this I would like to do a very small POC where we first equip one compute node with an Nvidia GPU. A few tips on which card would be a good starting point would also be highly appreciated. I know this depends heavily on the server hardware, but that is something I can figure out later. Also, do we need additional software licenses to run this? I saw this very nice presentation from CERN on VGPUs:
https://indico.cern.ch/event/776411/contributions/3345183/attachments/185162...
In the table they are listing Quadro vDWS licenses. I assume we need these in order to use the cards? Also do we need something like Cyborg for this or is VGPU fully implemented in Nova?
You can try using Cyborg to manage your GPU devices; it also supports listing and attaching vGPUs for an instance. If you want to attach or detach a device from an instance, you have to change your flavor, because the vGPU/GPU information currently has to be added to the flavor. (To use this feature, the GPU metadata may need to be separated from the flavor; we have discussed this in the Nova team before.) I work at Inspur, and in our InCloud OS product we use Cyborg to manage GPU/vGPU, FPGA, QAT and other devices. We have adapted the T4/T100 GPUs (vGPU support) and the A100 (MIG support). I think Cyborg is the better way to manage local GPU devices; please refer to the Cyborg API docs: https://docs.openstack.org/api-ref/accelerator/
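For comparison, here is a minimal sketch of the Nova-only path from the virtual GPU guide linked earlier in the thread (not the Cyborg setup), showing where the vGPU information ends up in the flavor. The mdev type `nvidia-35` and the flavor name `vgpu_1` are placeholders: the types actually available depend on the card and the installed vendor driver.

```ini
# /etc/nova/nova.conf on the GPU compute node (sketch, placeholder values).
# The mdev types a host offers can be listed with:
#   ls /sys/class/mdev_bus/*/mdev_supported_types
[devices]
enabled_mdev_types = nvidia-35

# The flavor then requests one vGPU through a resource extra spec,
# set with the openstack CLI ("vgpu_1" is a hypothetical flavor name):
#   openstack flavor set vgpu_1 --property "resources:VGPU=1"
```

With this approach the request lives entirely in the flavor, so changing an instance's GPU allocation means moving it to a different flavor.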
Best Regards,
Oliver