[openstack-dev] vGPUs support for Nova - Implementation
sbauza at redhat.com
Fri Sep 29 12:15:31 UTC 2017
On Fri, Sep 29, 2017 at 2:32 AM, Dan Smith <dms at danplanet.com> wrote:
>>> In this series of patches we are generalizing the PCI framework to
>>> handle MDEV devices. Arguably it's a lot of patches, but most of them
>>> are small, and the logic behind them is basically to make it understand
>>> two new fields, MDEV_PF and MDEV_VF.
>> That's not really "generalizing the PCI framework to handle MDEV devices"
>> :) More like it's just changing the /pci module to understand a different
>> device management API, but ok.
> Yeah, the series is adding more fields to our PCI structure to allow for
> more variations in the kinds of things we lump into those tables. This is
> my primary complaint with this approach, and has been since the topic first
> came up. I really want to avoid building any more dependency on the
> existing pci-passthrough mechanisms and focus any new effort on using
> resource providers for this. The existing pci-passthrough code is almost
> universally hated, poorly understood and tested, and something we should
> not be further building upon.
>>> In this series of patches we make the libvirt driver support, as usual,
>>> returning resources and attaching devices returned by the pci manager.
>>> This part can be reused for Resource Provider.
>> Perhaps, but the idea behind the resource providers framework is to treat
>> devices as generic things. Placement doesn't need to know about the
>> particular device attachment status.
> I quickly went through the patches and left a few comments. The base work
> of pulling some of this out of libvirt is there, but it's all focused on
> the act of populating pci structures from the vgpu information we get from
> libvirt. That code could be made to instead populate a resource inventory,
> but that's about the most of the set that looks applicable to the
> placement-based approach.
I'll review them too.
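For illustration only, here is a rough sketch of what "populate a resource
inventory" instead of pci structures might look like. The helper name
`vgpu_inventory` and its input (a mapping of mdev type name to the number of
vGPUs that can still be created) are hypothetical, and the type names in the
usage note are just examples. The record fields follow the usual Placement
inventory fields; this is not the code from the series.

```python
def vgpu_inventory(available_by_type):
    """Collapse per-type vGPU counts into a single VGPU inventory record.

    available_by_type is a hypothetical mapping of mdev type name to the
    number of vGPU instances the host can still create, as a virt driver
    might gather from libvirt.
    """
    # Without nested resource providers the types cannot be told apart,
    # so this sketch reports one flat total.
    total = sum(available_by_type.values())
    if total == 0:
        # Nothing to report: leave the VGPU resource class out entirely.
        return {}
    return {
        "VGPU": {
            "total": total,
            "reserved": 0,
            "min_unit": 1,
            "max_unit": total,
            "step_size": 1,
            "allocation_ratio": 1.0,
        }
    }
```

Calling `vgpu_inventory({"nvidia-35": 4, "nvidia-36": 2})` yields a single
VGPU record with a total of 6, with no DB migration or model change involved.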
>> As mentioned in IRC and the previous ML discussion, my focus is on the
>> nested resource providers work and reviews, along with the other two
>> top-priority scheduler items (move operations and alternate hosts).
>> I'll do my best to look at your patch series, but please note it's lower
>> priority than a number of other items.
> FWIW, I'm not really planning to spend any time reviewing it until/unless
> it is retooled to generate an inventory from the virt driver.
> With the two patches that report vgpus and then create guests with them
> when asked converted to resource providers, I think that would be enough to
> have basic vgpu support immediately. No DB migrations, model changes, etc
> required. After that, helping to get the nested-rps and traits work landed
> gets us the ability to expose attributes of different types of those vgpus
> and opens up a lot of possibilities. IMHO, that's work I'm interested in
That's exactly what I would like to provide for Queens: operators could have
flavors asking for vGPU resources, even if they couldn't yet ask for a
specific vGPU type (or ask for the vGPU to be in the same NUMA cell as the
CPU). Asking for a type or for NUMA affinity definitely needs nested resource
providers, but just having vGPU resource classes provided by the virt driver
is possible for Queens.
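As a minimal sketch of what that could mean for a flavor, assuming the
`resources:<CLASS>=<amount>` extra spec convention and an inventory record
with the usual Placement fields: the function names `requested_resources`
and `fits` are hypothetical, and the check ignores existing allocations, so
it only shows the shape of the request, not how the scheduler actually works.

```python
def requested_resources(extra_specs):
    """Extract {resource_class: amount} from flavor extra specs."""
    prefix = "resources:"
    wanted = {}
    for key, value in extra_specs.items():
        if key.startswith(prefix):
            wanted[key[len(prefix):]] = int(value)
    return wanted


def fits(wanted, inventory):
    """Return True if every requested amount fits the given inventory.

    Simplification: current usage is not tracked, only capacity.
    """
    for resource_class, amount in wanted.items():
        record = inventory.get(resource_class)
        if record is None:
            return False
        capacity = (record["total"] - record["reserved"]) * record["allocation_ratio"]
        if amount > capacity or amount > record["max_unit"]:
            return False
    return True
```

A flavor with `{"resources:VGPU": "1"}` would then ask for one vGPU of
whatever type the host exposes, which is all that is possible without nested
resource providers and traits.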
>> One thing that would be very useful, Sahid, if you could get with Eric
>> Fried (efried) on IRC and discuss with him the "generic device management"
>> system that was discussed at the PTG. It's likely that the /pci module is
>> going to be overhauled in Rocky and it would be good to have the mdev
>> device management API requirements included in that discussion.
> Definitely this.