Re: Re: Experience with VGPUs

open infra openinfradn at gmail.com
Wed Jan 18 14:03:22 UTC 2023


On Tue, Jan 17, 2023 at 4:54 PM Dmitriy Rabotyagov <noonedeadpunk at gmail.com>
wrote:

> Oh, wait a second, can you have multiple different types on 1 GPU? As
> I don't think you can, or maybe it's limited to MIG mode only - I'm
> using mostly vGPUs so not 100% sure about MIG mode.
> But on vGPU, once you create an mdev of one type, all other types
> become unavailable. Initially, each type reports one available instance:
> # cat
> /sys/bus/pci/devices/0000\:84\:00.1/mdev_supported_types/nvidia-699/available_instances
> 1
> # cat
> /sys/bus/pci/devices/0000\:84\:00.2/mdev_supported_types/nvidia-699/available_instances
> 1
> # cat
> /sys/bus/pci/devices/0000\:84\:00.2/mdev_supported_types/nvidia-700/available_instances
> 1
>
> BUT, once you create an mdev of a specific type, the rest no longer
> report as available.
> # echo $(uuidgen) >
> /sys/bus/pci/devices/0000\:84\:00.1/mdev_supported_types/nvidia-699/create
> # cat
> /sys/bus/pci/devices/0000\:84\:00.1/mdev_supported_types/nvidia-699/available_instances
> 0
> # cat
> /sys/bus/pci/devices/0000\:84\:00.2/mdev_supported_types/nvidia-699/available_instances
> 1
> # cat
> /sys/bus/pci/devices/0000\:84\:00.2/mdev_supported_types/nvidia-700/available_instances
> 0
>
> Please correct me if I'm wrong here: maybe Nvidia changed this in
> recent drivers, or it applies only to vGPUs and is not the case for
> MIG mode.
>

I created an A40-24Q instance on an A40 48GB GPU,
and I see the same behavior.
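
For anyone who wants to compare before and after creating an mdev, here is a minimal sketch that prints available_instances for every type under one PCI device. The helper name mdev_avail is hypothetical, and it assumes the standard mdev_supported_types sysfs layout shown in the quoted output:

```shell
# Hypothetical helper: list "<type> <available_instances>" for one PCI device.
# Argument: the device's sysfs directory, e.g. /sys/bus/pci/devices/0000:84:00.2
mdev_avail() {
    dev_dir=$1
    for type_dir in "$dev_dir"/mdev_supported_types/*/; do
        # Skip entries without a readable counter (also covers an unmatched glob)
        [ -r "${type_dir}available_instances" ] || continue
        type=$(basename "$type_dir")
        printf '%s %s\n' "$type" "$(cat "${type_dir}available_instances")"
    done
}

# Example call (address taken from the quoted output, adjust to your host):
# mdev_avail /sys/bus/pci/devices/0000:84:00.2
```

Running it once before and once after writing to a type's create file should make it easy to see which types drop to 0.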


>
> Tue, Jan 17, 2023, 03:37 Alex Song (宋文平) <songwenping at inspur.com>:
> >
> >
> > Hi, Ulrich:
> >
> > Sean is the expert on vGPU management on the Nova side. I'll fill in the
> > usage steps, assuming you use Nova to manage MIGs, for example:
> > 1. Divide the A100 (80G) GPU into 1g.10gb*1+2g.20gb*1+3g.40gb*1 (one
> > 1g.10gb, one 2g.20gb and one 3g.40gb).
> > 2. Add the device config to nova.conf:
> > [devices]
> > enabled_mdev_types = nvidia-699,nvidia-700,nvidia-701
> > [mdev_nvidia-699]
> > device_addresses = 0000:84:00.1
> > [mdev_nvidia-700]
> > device_addresses = 0000:84:00.2
> > [mdev_nvidia-701]
> > device_addresses = 0000:84:00.3
> > 3. Configure the flavor metadata with VGPU:1 and create a VM with that
> > flavor; the VM will be allocated one MIG at random from
> > [1g.10gb, 2g.20gb, 3g.40gb].
> > For step 2, if you have two A100 (80G) GPUs on one node using MIG, and the
> > other GPU is divided into 1g.10gb*3+4g.40gb*1, the config may look like this:
> > [devices]
> > enabled_mdev_types = nvidia-699,nvidia-700,nvidia-701,nvidia-702
> > [mdev_nvidia-699]
> > device_addresses = 0000:84:00.1, 0000:3b:00.1
> > [mdev_nvidia-700]
> > device_addresses = 0000:84:00.2
> > [mdev_nvidia-701]
> > device_addresses = 0000:84:00.3
> > [mdev_nvidia-702]
> > device_addresses = 0000:3b:00.3
> >
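
To cross-check a layout like the one quoted above against what the host actually exposes, something along these lines might help. It is only a sketch: mdev_pairs is a hypothetical helper, and it assumes the [mdev_<type>] / device_addresses format shown in the quoted config rather than any official parser:

```shell
# Hypothetical helper: print "<type> <pci address>" for each address listed
# under an [mdev_<type>] section of a nova.conf-style file.
mdev_pairs() {
    awk '
        /^\[/ {
            type = ""
            # "[mdev_" is 6 characters, "]" is 1, so the type name is length-7 long
            if ($0 ~ /^\[mdev_.*\]$/) type = substr($0, 7, length($0) - 7)
            next
        }
        type != "" && /^device_addresses[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")   # drop the key and the "="
            sub(/[ \t]+$/, "")         # drop trailing whitespace
            n = split($0, addrs, /[ \t]*,[ \t]*/)
            for (i = 1; i <= n; i++)
                if (addrs[i] != "") print type, addrs[i]
        }
    ' "$1"
}

# Example: report the sysfs counter for each configured pair
# (the /etc/nova/nova.conf path is an assumption, adjust to your deployment):
# mdev_pairs /etc/nova/nova.conf | while read -r type addr; do
#     f=/sys/bus/pci/devices/$addr/mdev_supported_types/$type/available_instances
#     [ -r "$f" ] && echo "$type $addr $(cat "$f")" || echo "$type $addr missing"
# done
```

Any pair reported as missing would point to a type that the given address does not expose.
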
> > In our product, we use Cyborg to manage the MIGs. In the legacy style we
> > also had to configure the MIGs as in Nova, which is difficult to maintain,
> > especially when deploying OpenStack on k8s. So we removed this config; Cyborg
> > now discovers the MIGs automatically and supports dividing MIGs through the
> > Cyborg API. By creating a device profile with vGPU type traits (nvidia-699,
> > nvidia-700), we can specify the MIG size when creating VMs.
> >
> > Kind regards
> >
>
>