On Thu, 21 Apr 2022 at 12:26, Sean Mooney <smooney@redhat.com> wrote:
On Wed, 2022-04-20 at 16:42 +0000, Sigurd Kristian Brinch wrote:
> Hi,
> As far as I can tell, libvirt/KVM supports multiple vGPUs per VM
> (https://docs.nvidia.com/grid/14.0/grid-vgpu-release-notes-generic-linux-kvm/index.html#multiple-vgpu-support),
> but in OpenStack/Nova it is limited to one vGPU per VM
> (https://docs.openstack.org/nova/latest/admin/virtual-gpu.html#configure-a-flavor-controller)
> Is there a reason for this limit?
Yes: NVIDIA.
> What would be needed to enable multiple vGPUs in Nova?
So you can technically do it today if you have two vGPUs from separate physical GPU cards,
but NVIDIA does not support multiple vGPUs from the same card.

Nova does not currently provide a way to force the GPU allocation to come from separate cards.


Well, that's not quite true; you could.

You would have to use the named group syntax to request them, so instead of resources:VGPU=2
you would do:

resources_first_gpu_group:VGPU=1 
resources_second_gpu_group:VGPU=1
group_policy=isolate

The name after resources_ is an arbitrary group name, provided it conforms to this regex: '([a-zA-Z0-9_-]{1,64})?'
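As a rough illustration of that naming rule (this is not Nova's own validation code, and is_valid_group_name is a hypothetical helper), the quoted regex can be checked in Python like so:

```python
import re

# Pattern quoted in the message above. The trailing "?" means the empty
# string also matches, so the helper rejects empty names explicitly.
GROUP_SUFFIX = re.compile(r"([a-zA-Z0-9_-]{1,64})?")


def is_valid_group_name(name: str) -> bool:
    """Return True if name is a non-empty suffix accepted by the quoted regex.

    Hypothetical helper for illustration only.
    """
    return bool(name) and GROUP_SUFFIX.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("first_gpu_group", "second-gpu-group", "bad name", "x" * 65):
        print(repr(candidate[:20]), is_valid_group_name(candidate))
```

fullmatch is used so that the whole name has to fit the pattern, not just a prefix of it.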

We strongly dislike this approach.
First of all, group_policy=isolate is a global setting, meaning that no two request groups can come from the same provider.

That means you cannot have two SR-IOV VFs from the same physical NIC as a result of setting it.
If you don't set group_policy, the default is none, which means you are no longer guaranteed that they will come from different providers.
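To make the difference between the two policies concrete, here is a toy model of it in Python. It is only a sketch under simplifying assumptions: it is not how placement computes allocation candidates, and the function name and the provider names (pgpu0, pgpu1) are made up for the example.

```python
from itertools import product


def toy_candidates(groups, providers, group_policy="none"):
    """Enumerate ways to place each request group on a provider.

    groups:    {group name: number of VGPU requested by that group}
    providers: {provider name: number of VGPU available}
    Toy model only; real placement logic is far more involved.
    """
    names = list(groups)
    found = []
    for combo in product(providers, repeat=len(names)):
        # "isolate": every request group must land on a different provider.
        if group_policy == "isolate" and len(set(combo)) != len(combo):
            continue
        # Check that no provider is asked for more than it has.
        used = {}
        for group, provider in zip(names, combo):
            used[provider] = used.get(provider, 0) + groups[group]
        if all(used[p] <= providers[p] for p in used):
            found.append(dict(zip(names, combo)))
    return found


if __name__ == "__main__":
    request = {"first_gpu_group": 1, "second_gpu_group": 1}
    two_cards = {"pgpu0": 4, "pgpu1": 4}
    print(len(toy_candidates(request, two_cards, "none")))     # same card allowed
    print(len(toy_candidates(request, two_cards, "isolate")))  # different cards only
```

With group_policy=none the model also returns candidates where both groups share one card, which is exactly the case that fails on NVIDIA hardware; with isolate those candidates disappear, but so would any other pair of groups sharing a provider.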

So what you would need to do is extend placement to support isolating only specific named groups
and then expose that in Nova via flavor extra specs, which is not particularly good UX, as it is rather complicated and means you need to
understand how placement works in depth. Placement should really be an implementation detail.
i.e.
resources_first_gpu_group:VGPU=1
resources_second_gpu_group:VGPU=1
group_isolate=first_gpu_group,second_gpu_group;...

That fixes the conflict with SR-IOV and all other usages of resource groups, like bandwidth-based QoS.

The slightly better approach would be to make this simpler to use by doing something like this:

resources:VGPU=2
vgpu:gpu_selection_policy=isolate

We would still need the placement feature to isolate by group,
but we can hide the detail from the end user with a pre-filter in Nova
https://github.com/openstack/nova/blob/eedbff38599addd4574084edac8b111c4e1f244a/nova/scheduler/request_filter.py
which will transform the resource request and split it up into groups automatically.
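A minimal sketch of what such a transformation could look like, written as a plain function over a dict of flavor extra specs. Everything here is hypothetical: split_vgpu_request and the vgpu_N group names are invented for the example, the vgpu:gpu_selection_policy and group_isolate keys are the proposals from this mail and do not exist in Nova or placement today, and a real pre-filter in request_filter.py would operate on the request spec rather than on raw extra specs.

```python
def split_vgpu_request(extra_specs):
    """Rewrite resources:VGPU=N into N named groups of one VGPU each.

    Hypothetical illustration of the proposed pre-filter behaviour.
    Returns a new dict; the input is left untouched.
    """
    specs = dict(extra_specs)
    policy = specs.pop("vgpu:gpu_selection_policy", None)
    count = int(specs.get("resources:VGPU", 0))
    if policy != "isolate" or count < 2:
        # Nothing to split: keep the plain request as it was.
        return specs

    del specs["resources:VGPU"]
    groups = [f"vgpu_{i}" for i in range(count)]  # made-up group names
    for group in groups:
        specs[f"resources_{group}:VGPU"] = "1"
    # Proposed (not yet existing) per-group isolation key from this thread.
    specs["group_isolate"] = ",".join(groups)
    return specs


if __name__ == "__main__":
    flavor = {"resources:VGPU": "2", "vgpu:gpu_selection_policy": "isolate"}
    for key, value in split_vgpu_request(flavor).items():
        print(f"{key}={value}")
```

The point of the sketch is only that the user keeps writing the simple two-line request while the named groups are generated for them.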

This is a long way to say that, if it were not for limitations in the IOMMU on NVIDIA GPUs and the fact that they cannot map two vGPUs
from one physical GPU to a single VM, this would already work out of the box with just
resources:VGPU=2. Perhaps when Intel launches their discrete datacenter GPUs, their vGPU implementation will not have this limitation.
We do not prevent you from requesting 2 vGPUs today; it will just fail when QEMU tries to use them.

We also have not put the effort into working around the limitation in NVIDIA's hardware, since their drivers also used to block this
until the Ampere generation, and there has not been much demand from users to support multiple vGPUs.

Occasionally someone will ask about it, but in general people either do full GPU passthrough or use 1 vGPU instance.


Correct, that's why we have had this bug report open for a while, but we don't really want to fix it for only one vendor.
 
Hopefully that will help.
You can try the first approach today if you have more than one physical GPU per host,
e.g.
resources_first_gpu_group:VGPU=1
resources_second_gpu_group:VGPU=1
group_policy=isolate

Just be aware of the limitation of group_policy=isolate.

Thanks Sean for explaining how to use a workaround.
 

regards
sean


>
> BR
> Sigurd