<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 21, 2022 at 12:26, Sean Mooney <<a href="mailto:smooney@redhat.com">smooney@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Wed, 2022-04-20 at 16:42 +0000, Sigurd Kristian Brinch wrote:<br>

> Hi,<br>

> As far as I can tell, libvirt/KVM supports multiple vGPUs per VM<br>

> (<a href="https://docs.nvidia.com/grid/14.0/grid-vgpu-release-notes-generic-linux-kvm/index.html#multiple-vgpu-support" rel="noreferrer" target="_blank">https://docs.nvidia.com/grid/14.0/grid-vgpu-release-notes-generic-linux-kvm/index.html#multiple-vgpu-support</a>),<br>

> but in OpenStack/Nova it is limited to one vGPU per VM<br>

> (<a href="https://docs.openstack.org/nova/latest/admin/virtual-gpu.html#configure-a-flavor-controller" rel="noreferrer" target="_blank">https://docs.openstack.org/nova/latest/admin/virtual-gpu.html#configure-a-flavor-controller</a>)<br>

> Is there a reason for this limit?<br>

yes, nvidia<br>

> What would be needed to enable multiple vGPUs in Nova?<br>

so you can technically do it today if you have 2 vGPUs from separate physical gpu cards<br>

but nvidia does not support multiple vGPUs from the same card.<br>

<br>

nova does not currently provide a way to force the gpu allocation to be from separate cards.<br>

<br>

<br>

well, that's not quite true, you could.<br>

<br>

you would have to use the named group syntax to request them, so instead of resources:VGPU=2<br>

<br>

you would do<br>

<br>

resources_first_gpu_group:VGPU=1 <br>

resources_second_gpu_group:VGPU=1<br>

group_policy=isolate<br>

<br>

the name after resources_ is an arbitrary group name, provided it conforms to this regex '([a-zA-Z0-9_-]{1,64})?'<br>
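
<br>
a minimal python sketch of building those extra specs, assuming hypothetical helper names (GROUP_SUFFIX_RE and build_isolated_vgpu_specs are illustrations, not part of nova). it only checks the group names against the regex quoted above and returns the key/value pairs you would set on the flavor:<br>
<pre>
```python
import re

# Pattern for the group suffix, taken from the regex quoted above.
GROUP_SUFFIX_RE = re.compile(r"[a-zA-Z0-9_-]{1,64}")


def build_isolated_vgpu_specs(group_names):
    """Return flavor extra specs requesting one VGPU per named group.

    Hypothetical helper for illustration only; it is not part of nova.
    """
    specs = {}
    for name in group_names:
        # Validate the whole suffix (underscore plus name), which is the
        # stricter reading of the naming rule.
        suffix = "_" + name
        if not GROUP_SUFFIX_RE.fullmatch(suffix):
            raise ValueError(f"invalid group name: {name!r}")
        specs[f"resources{suffix}:VGPU"] = "1"
    # group_policy applies to every suffixed group in the request.
    specs["group_policy"] = "isolate"
    return specs
```
</pre>
calling build_isolated_vgpu_specs(["first_gpu_group", "second_gpu_group"]) gives the three properties shown above.<br>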

<br>

we strongly dislike this approach.<br>

first of all, using group_policy=isolate is a global thing, meaning that no two request groups can come from the same provider<br>

<br>

that means you cannot have two sriov VFs from the same physical nic as a result of setting it.<br>

if you don't set group_policy, the default is none, which means you are no longer guaranteed that they will come from different providers<br>

<br>

so what you would need to do is extend placement to support isolating only specific named groups<br>

and then expose that in nova via flavor extra specs, which is not particularly good ux as it is rather complicated and means you need to<br>

understand how placement works in depth. placement should really be an implementation detail<br>

i.e.<br>

resources_first_gpu_group:VGPU=1 <br>

resources_second_gpu_group:VGPU=1<br>

group_isolate=first_gpu_group,second_gpu_group;...<br>

<br>

that fixes the conflict with sriov and all other usages of resource groups, like bandwidth based qos<br>
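
<br>
to make the difference concrete, here is a small python sketch, assuming a simplified model where each request group lands on exactly one resource provider. violates_isolation and the provider names are hypothetical, and the per-group mode models the proposed group_isolate, which does not exist in placement today:<br>
<pre>
```python
def violates_isolation(assignments, isolated_groups=None):
    """Report whether two isolated request groups share a provider.

    assignments maps request group name to resource provider name.
    isolated_groups=None models today's global group_policy=isolate;
    a set of names models the proposed (hypothetical) group_isolate.
    """
    seen = set()
    for group, provider in assignments.items():
        if isolated_groups is not None and group not in isolated_groups:
            continue  # this group is not subject to isolation
        if provider in seen:
            return True
        seen.add(provider)
    return False


# Two vGPUs on separate cards plus two SR-IOV VFs on one NIC.
assignments = {
    "first_gpu_group": "pgpu_0",
    "second_gpu_group": "pgpu_1",
    "vf_a": "nic_0",
    "vf_b": "nic_0",
}
```
</pre>
with the global policy this placement is rejected because the two VFs share nic_0; isolating only the two gpu groups accepts it.<br>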

<br>

the slightly better approach would be to make this simpler to use by doing something like this<br>

<br>

resources:VGPU=2<br>

vgpu:gpu_selection_policy=isolate<br>

<br>

we would still need the placement feature to isolate by group<br>

but we can hide the detail from the end user with a pre filter in nova<br>

<a href="https://github.com/openstack/nova/blob/eedbff38599addd4574084edac8b111c4e1f244a/nova/scheduler/request_filter.py" rel="noreferrer" target="_blank">https://github.com/openstack/nova/blob/eedbff38599addd4574084edac8b111c4e1f244a/nova/scheduler/request_filter.py</a><br>

which will transform the resource request and split it up into groups automatically<br>
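
<br>
a rough python sketch of that transformation, assuming it works on a plain dict of extra specs (real nova request filters operate on the request spec object, not a dict). split_vgpu_request and the generated vgpu_N group names are hypothetical, and vgpu:gpu_selection_policy and group_isolate are only the proposed names from above:<br>
<pre>
```python
def split_vgpu_request(extra_specs):
    """Rewrite resources:VGPU=N into N single-vGPU named groups.

    Hypothetical transformation; vgpu:gpu_selection_policy and
    group_isolate come from the proposal and do not exist in nova
    or placement today.
    """
    specs = dict(extra_specs)
    if specs.pop("vgpu:gpu_selection_policy", None) != "isolate":
        # No isolation requested: leave the request untouched.
        return dict(extra_specs)
    count = int(specs.pop("resources:VGPU", "0"))
    groups = [f"vgpu_{i}" for i in range(count)]
    for group in groups:
        specs[f"resources_{group}:VGPU"] = "1"
    if groups:
        # Isolate only the generated vGPU groups from each other.
        specs["group_isolate"] = ",".join(groups)
    return specs
```
</pre>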

<br>

this is a long way to say that if it was not for limitations in the iommu on nvidia gpus, and the fact that they cannot map two vgpus<br>

from one physical gpu to a single vm, this would already work out of the box with just<br>

resources:VGPU=2. perhaps when intel launches their discrete datacenter gpus, their vGPU implementation will not have this limitation.<br>

we do not prevent you from requesting 2 vgpus today; it will just fail when qemu tries to use them.<br>

<br>

we also have not put the effort into working around the limitation in nvidia's hardware, since their drivers also used to block this<br>

until the Ampere generation, and there has not been much demand from users to support multiple vgpus.<br>

<br>

occasionally someone will ask about it, but in general people either do full gpu passthrough or use 1 vgpu instance.<br>

<br></blockquote><div><br></div><div>Correct, that's why we have had this bug report open for a while, but we don't really want to fix it for only one vendor.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

hopefully that will help.<br>

you can try the first approach today if you have more than one physical gpu per host<br>

e.g.<br>

resources_first_gpu_group:VGPU=1 <br>

resources_second_gpu_group:VGPU=1<br>

group_policy=isolate<br>

<br>

just be aware of the limitation of group_policy=isolate<br></blockquote><div><br></div><div>Thanks Sean for explaining how to use a workaround.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

regards<br>

sean<br>

<br>

<br>

> <br>

> BR<br>

> Sigurd<br>

<br>

<br>

</blockquote></div></div>