[openstack-dev] vGPUs support for Nova - Implementation

Dan Smith dms at danplanet.com
Fri Sep 29 00:32:34 UTC 2017

>> In this series of patches we are generalizing the PCI framework to
>> handle MDEV devices. Admittedly it's a lot of patches, but most of
>> them are small and the logic behind them is basically to make it
>> understand two new fields, MDEV_PF and MDEV_VF.
> That's not really "generalizing the PCI framework to handle MDEV 
> devices" :) More like it's just changing the /pci module to understand a 
> different device management API, but ok.

Yeah, the series is adding more fields to our PCI structure to allow for 
more variations in the kinds of things we lump into those tables. This 
is my primary complaint with this approach, and has been since the topic 
first came up. I really want to avoid building any more dependency on 
the existing pci-passthrough mechanisms and focus any new effort on 
using resource providers for this. The existing pci-passthrough code is 
almost universally hated, poorly understood, and poorly tested, and it 
is something we should not be building upon any further.

>> In this series of patches we make the libvirt driver, as usual,
>> return resources and attach devices returned by the pci manager. This
>> part can be reused for Resource Provider.
> Perhaps, but the idea behind the resource providers framework is to 
> treat devices as generic things. Placement doesn't need to know about 
> the particular device attachment status.

I quickly went through the patches and left a few comments. The base 
work of pulling some of this out of libvirt is there, but it's all 
focused on populating pci structures from the vgpu information we get 
from libvirt. That code could be made to populate a resource inventory 
instead, but that's about the only part of the set that looks 
applicable to the placement-based approach.
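
As a rough illustration of what "populate a resource inventory" could 
mean here, the following is a minimal sketch and is not taken from the 
patch series. The function name and the shape of the input dicts 
(a "type" and an "available_instances" count per mdev type) are 
hypothetical. The inventory keys are the usual placement inventory 
fields, and VGPU is assumed as the resource class name.

```python
from typing import Dict, List


def build_vgpu_inventory(mdev_types: List[dict]) -> Dict[str, dict]:
    """Turn per-type vgpu counts into a placement-style inventory.

    `mdev_types` is a hypothetical, simplified view of what the virt
    driver might learn from libvirt: one dict per mdev type, each
    carrying an "available_instances" count.
    """
    total = sum(t["available_instances"] for t in mdev_types)
    if total == 0:
        # Nothing to report, so no VGPU inventory for this provider.
        return {}
    return {
        "VGPU": {
            "total": total,
            "reserved": 0,
            "min_unit": 1,
            # Assumption: one guest may consume up to all vgpus.
            "max_unit": total,
            "step_size": 1,
            # vgpus are not overcommitted in this sketch.
            "allocation_ratio": 1.0,
        }
    }
```

Note that this flattens all vgpu types into a single count, which is 
the "basic" case. Telling different types apart is what the 
nested-rps and traits work would add.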

> As mentioned in IRC and the previous ML discussion, my focus is on the 
> nested resource providers work and reviews, along with the other two 
> top-priority scheduler items (move operations and alternate hosts).
> I'll do my best to look at your patch series, but please note it's lower 
> priority than a number of other items.

FWIW, I'm not really planning to spend any time reviewing it 
until/unless it is retooled to generate an inventory from the virt driver.

If the two patches that report vgpus and then create guests with them 
when asked were converted to use resource providers, I think that would 
be enough to have basic vgpu support immediately. No DB migrations, 
model changes, etc. required. After that, helping to get the nested-rps 
and traits work landed gives us the ability to expose attributes of 
different types of those vgpus and opens up a lot of possibilities. 
IMHO, that's work I'm interested in reviewing.

> One thing that would be very useful, Sahid, if you could get with Eric 
> Fried (efried) on IRC and discuss with him the "generic device 
> management" system that was discussed at the PTG. It's likely that the 
> /pci module is going to be overhauled in Rocky and it would be good to 
> have the mdev device management API requirements included in that 
> discussion.

Definitely this.


