[openstack-dev] vGPUs support for Nova
jianghua.wang at citrix.com
Mon Sep 25 16:59:04 UTC 2017
Just to share some background: XenServer doesn't expose vGPUs as mdev or PCI devices. About a year ago I proposed a spec to create fake PCI devices so that we could use the existing PCI mechanism to cover vGPUs. But that was not a good design and it met strong objections. After that, we switched to using resource providers, following the advice of the core team.
From: Sahid Orentino Ferdjaoui [mailto:sferdjao at redhat.com]
Sent: Monday, September 25, 2017 11:01 PM
To: OpenStack Development Mailing List (not for usage questions) <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] vGPUs support for Nova
On Mon, Sep 25, 2017 at 09:29:25AM -0500, Matt Riedemann wrote:
> On 9/25/2017 5:40 AM, Jay Pipes wrote:
> > On 09/25/2017 05:39 AM, Sahid Orentino Ferdjaoui wrote:
> > > There is a desire to expose vGPU resources on top of Resource
> > > Provider which is probably the path we should be going in the long
> > > term. I was not there for the last PTG and you probably already
> > > made a decision about moving in that direction anyway. My personal
> > > feeling is that it is premature.
> > >
> > > The nested Resource Provider work is not yet feature-complete and
> > > requires more reviewer attention. If we continue in the direction
> > > of Resource Provider, it will need at least 2 more releases to
> > > expose the vGPU feature, and even then without NUMA support and
> > > with the feeling of pushing something which is not stable/production-ready.
> > >
> > > It seems safer to first have the Resource Provider work well
> > > finalized/stabilized to be production-ready. Then on top of
> > > something stable we could start to migrate our current virt
> > > specific features like NUMA, CPU Pinning, Huge Pages and finally PCI devices.
> > >
> > > I'm talking about PCI devices in general because I think we should
> > > implement the vGPU on top of our /pci framework which is
> > > production ready and provides the support of NUMA.
> > >
> > > The hardware vendors are building their drivers using mdev, and
> > > the /pci framework currently understands only SR-IOV, but at a quick
> > > glance it does not seem complicated to make it support mdev.
> > >
> > > In the /pci framework we will have to:
> > >
> > > * Update the PciDevice object fields to accept a NULL value for
> > > 'address' and add a new field 'uuid'
> > > * Update PciRequest to handle a new tag like 'vgpu_types'
> > > * Update PciDeviceStats to also maintain a pool of vGPUs
> > >
> > > The operators will have to create alias(-es) and configure
> > > flavors. Basically most of the logic is already implemented and
> > > the method 'consume_request' is going to select the right vGPUs
> > > according to the request.
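To make the proposal above concrete, here is a minimal, self-contained sketch of the idea: a device record whose 'address' may be empty and which carries a 'uuid', plus pools of free vGPUs from which a request is served. This is not Nova code. The class names `Device` and `DevicePools`, the pool key, and the vGPU type string "type-a" are assumptions made purely for illustration; only the field names 'address', 'uuid', 'vgpu_types' and the method name 'consume_request' come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import uuid


@dataclass
class Device:
    """Hypothetical stand-in for a PCI-style device record."""
    dev_type: str                     # e.g. "vf" or "vgpu"
    address: Optional[str] = None     # PCI address; None for an mdev-backed vGPU
    uuid: Optional[str] = None        # mdev UUID; None for a plain PCI device
    vgpu_type: Optional[str] = None   # vendor-defined vGPU type, if any


@dataclass
class DevicePools:
    """Hypothetical pools of free devices, keyed by vGPU type."""
    pools: Dict[str, List[Device]] = field(default_factory=dict)

    def add(self, dev: Device) -> None:
        # Devices without a vGPU type go into a generic "pci" pool.
        self.pools.setdefault(dev.vgpu_type or "pci", []).append(dev)

    def consume_request(self, vgpu_types: List[str],
                        count: int = 1) -> Optional[List[Device]]:
        """Take `count` devices from the first pool matching a requested type."""
        for vtype in vgpu_types:
            pool = self.pools.get(vtype, [])
            if len(pool) >= count:
                taken, self.pools[vtype] = pool[:count], pool[count:]
                return taken
        return None  # the request cannot be satisfied


# Usage: two free vGPUs of an illustrative type, one of which is claimed.
pools = DevicePools()
pools.add(Device("vgpu", uuid=str(uuid.uuid4()), vgpu_type="type-a"))
pools.add(Device("vgpu", uuid=str(uuid.uuid4()), vgpu_type="type-a"))
claimed = pools.consume_request(["type-a"], count=1)
```

The sketch only shows why a nullable 'address' and a 'uuid' field are enough to let one pool structure hold both kinds of device; NUMA affinity and flavor aliases are left out.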
> > >
> > > In /virt we will have to:
> > >
> > > * Update the field 'pci_passthrough_devices' to also include GPU
> > > devices.
> > > * Update attach/detach PCI device to handle vGPUs
> > >
> > > We have a few people interested in working on it, so we could
> > > certainly make this feature available for Queens.
> > >
> > > I can take the lead on updating/implementing the PCI and libvirt
> > > driver parts; I'm sure Jianghua Wang will be happy to take the lead
> > > on the XenServer virt part.
> > >
> > > And I trust Jay, Stephen and Sylvain to follow the developments.
> > I understand the desire to get something in to Nova to support
> > vGPUs, and I understand that the existing /pci modules represent the
> > fastest/cheapest way to get there.
> > I won't block you from making any of the above changes, Sahid. I'll
> > even do my best to review them. However, I will be primarily
> > focusing this cycle on getting the nested resource providers work
> > feature-complete for (at least) SR-IOV PF/VF devices.
> > The decision of whether to allow an approach that adds more to the
> > existing /pci module is ultimately Matt's.
> > Best,
> > -jay
> > > __________________________________________________________________________
> > > OpenStack Development Mailing List (not for usage questions)
> > Unsubscribe:
> > OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> Nested resource providers is not merged or production ready because we
> haven't made it a priority. We've certainly talked about it and Jay
> has had patches proposed for several releases now though.
> Building vGPU support into the existing framework, which only a couple
> of people understand - certainly not me, might be a short-term gain
> but is just more technical debt we have to pay off later, and delays
> any focus on nested resource providers for the wider team.
> At the Queens PTG it was abundantly clear that many features are
> dependent on nested resource providers, including several
> networking-related features like bandwidth-based scheduling.
> The priorities for placement/scheduler in Queens are:
> 1. Dan Smith's migration allocations cleanup.
> 2. Alternative hosts for reschedules with cells v2.
> 3. Nested resource providers.
> All of these are in progress and need review.
> I personally don't think we should abandon the plan to implement vGPU
> support with nested resource providers without first seeing any code
> changes for it as a proof of concept. It also sounds like we have a
> pretty simple staggered plan for rolling out vGPU support so it's not
> very detailed to start. The virt driver reports vGPU inventory and we
> decorate the details later with traits (which Alex Xu is working on and needs review).
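As a rough illustration of that staggered plan, the sketch below first builds an inventory record for a vGPU resource class and then attaches traits to the provider afterwards. The inventory keys (total, reserved, min_unit, max_unit, step_size, allocation_ratio) follow the shape of placement inventory records; the function names, the provider dictionary layout, and the trait string are assumptions for illustration and are not taken from Nova.

```python
from typing import Dict, Iterable, Union

Number = Union[int, float]


def build_vgpu_inventory(total: int) -> Dict[str, Dict[str, Number]]:
    """Step 1 (hypothetical helper): report only how many vGPUs exist."""
    return {
        "VGPU": {
            "total": total,
            "reserved": 0,
            "min_unit": 1,
            "max_unit": total,
            "step_size": 1,
            "allocation_ratio": 1.0,  # vGPUs are not overcommitted here
        }
    }


def decorate_with_traits(provider: dict, traits: Iterable[str]) -> dict:
    """Step 2 (hypothetical helper): add qualitative details later."""
    provider.setdefault("traits", set()).update(traits)
    return provider


# Usage: a provider that starts with a bare inventory and gains traits later.
provider = {"name": "compute-1", "inventory": build_vgpu_inventory(4)}
decorate_with_traits(provider, ["CUSTOM_EXAMPLE_VGPU_TYPE"])
```

Keeping the count in the inventory and the qualitative details in traits is what lets the first step ship without waiting for the second.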
> Sahid, you could certainly implement a separate proof of concept and
> make that available if the nested resource providers-based change hits
> major issues or goes far too long and has too much risk, then we have
> a contingency plan at least. But I don't expect that to get review
> priority and you'd have to accept that it might not get merged since
> we want to use nested resource providers.
That seems fair. I understand your desire to make the implementation on Resource Provider a priority, and I'm with you. In general, my preference is not to stop progress on virt features because we have a new "product" under way.
> Either way we are going to need solid functional testing and that
> functional testing should be written against the API as much as
> possible so that it works regardless of the backend implementation of
> the feature. One of the big things we failed at in Pike was not doing
> enough functional testing of move operations with claims in the
> scheduler earlier in the cycle. That all came in late and we're still fixing bugs as a result.
That's very true, and most of the time we are asking our users to be beta-testers; that is one more reason why my preference is for a real deprecation phase.
> If we can get started early on the functional testing for vGPUs, then
> work both implementations in parallel, we should be able to retain the
> functional tests and determine which implementation we ultimately need
> to go with probably sometime in the second milestone.