Re: Experience with VGPUs

Dmitriy Rabotyagov noonedeadpunk at gmail.com
Fri Jan 13 20:06:21 UTC 2023


That said, the deb/rpm packages they provide don't help much, as:
* There is no repo for them, so you need to download them manually from
the enterprise portal.
* They can't be upgraded in place anyway, as the driver version is part of
the package name and each package conflicts with every other one. So you
need to explicitly remove the old package and only then install the new
one (see the sketch below). And yes, you must stop all VMs before
upgrading the driver, and no, you can't live migrate GPU mdev devices,
since that is not implemented in qemu. So deb/rpm/generic driver doesn't
really matter in the end, tbh.
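
For illustration only, a rough sketch of how that upgrade dance could be
scripted on a hypervisor. The package name and .deb path below are made up
(they depend on the exact GRID release pulled from the portal); virsh and
dpkg are just the standard tools:

    # upgrade_vgpu_manager.py - illustrative sketch, not an official procedure
    import subprocess

    OLD_PKG = "nvidia-vgpu-ubuntu-510"                        # hypothetical old package name
    NEW_DEB = "./nvidia-vgpu-ubuntu-525_525.85.07_amd64.deb"  # hypothetical file from the portal

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1. Stop every running VM on this hypervisor first (mdevs can't be migrated away).
    running = subprocess.run(["virsh", "list", "--name"],
                             capture_output=True, text=True, check=True).stdout.split()
    for dom in running:
        run(["virsh", "shutdown", dom])   # wait for clean shutdown before proceeding

    # 2. Remove the old vGPU manager package; it conflicts with the new one by name.
    run(["dpkg", "-r", OLD_PKG])

    # 3. Install the new package that was downloaded manually from the enterprise portal.
    run(["dpkg", "-i", NEW_DEB])

    # 4. Reload the NVIDIA kernel modules or reboot the node before starting VMs again.

In practice you would also wait for the guests to actually shut down and
handle the reboot, but that's the gist of it.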


Fri, Jan 13, 2023, 20:56 Cedric <yipikai7 at gmail.com>:

>
> Ended up with the very same conclusions as Dmitriy regarding the use of
> Nvidia Vgrid for the vGPU use case with Nova; it works pretty well, but:
>
> - respecting the licensing model is an operational constraint: note that
> guests need to reach a license server in order to get a token (either via
> the Nvidia SaaS service or on-prem)
> - drivers for both guest and hypervisor are not easy to deploy and
> maintain at large scale. A year ago, hypervisor drivers were not packaged
> for Debian/Ubuntu but built through a bash script, thus requiring
> additional automation work and careful attention around kernel
> updates/reboots of Nova hypervisors.
>
> Cheers
>
>
> On Fri, Jan 13, 2023 at 4:21 PM Dmitriy Rabotyagov <
> noonedeadpunk at gmail.com> wrote:
> >
> > You are saying that as if Nvidia GRID drivers were open source, while
> > in fact they're super far from that. In order to download drivers not
> > only for hypervisors but also for guest VMs, you need to have an
> > account in their Enterprise Portal. It took me roughly 6 weeks of
> > discussions with hardware vendors and Nvidia support to get a proper
> > account there. And that happened only after applying for their Partner
> > Network (NPN).
> > That still doesn't solve the issue of how to provide drivers to
> > guests, other than pre-building a series of images with these drivers
> > pre-installed (we ended up making a DIB element for that [1]).
> > Not to mention the need to distribute license tokens to guests and
> > the whole mess with compatibility between hypervisor and guest drivers
> > (the guest driver can't be newer than the host one, and the hypervisor
> > driver can't be too new either).
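> > As a rough illustration of that constraint (the version strings below
> > are made up, e.g. as reported by nvidia-smi), the check boils down to
> > something like:
> >
> >     # hypothetical driver versions
> >     host_driver = "525.85.07"    # vGPU manager on the hypervisor
> >     guest_driver = "525.85.05"   # guest driver baked into the image
> >
> >     def as_tuple(version):
> >         return tuple(int(part) for part in version.split("."))
> >
> >     # the guest driver must not be newer than the host one
> >     assert as_tuple(guest_driver) <= as_tuple(host_driver)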
> >
> > It's not that I'm defending AMD, I'm just saying that Nvidia is not
> > that straightforward either, and at least on paper AMD vGPUs look
> > easier for both operators and end users.
> >
> > [1] https://github.com/citynetwork/dib-elements/tree/main/nvgrid
> >
> > >
> > > As for AMD cards, AMD stated that some of their MI series cards
> > > support SR-IOV for vGPUs. However, those drivers are neither open
> > > source nor provided as closed-source binaries to the public; only
> > > large cloud providers are able to get them. So I don't really
> > > recommend getting AMD cards for vGPU unless you are able to get
> > > support from them.
> > >
> >
>

