[openstack-dev] vGPUs support for Nova

Jianghua Wang jianghua.wang at citrix.com
Mon Sep 25 16:59:04 UTC 2017


Sahid,

   Just to share some background: XenServer doesn't expose vGPUs as mdev or PCI devices. I proposed a spec about a year ago to create fake PCI devices so that we could reuse the existing PCI mechanism to cover vGPUs, but that was not a good design and got strong objections. After that, we switched to using resource providers, following the advice from the core team.

Regards,
Jianghua

-----Original Message-----
From: Sahid Orentino Ferdjaoui [mailto:sferdjao at redhat.com] 
Sent: Monday, September 25, 2017 11:01 PM
To: OpenStack Development Mailing List (not for usage questions) <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] vGPUs support for Nova

On Mon, Sep 25, 2017 at 09:29:25AM -0500, Matt Riedemann wrote:
> On 9/25/2017 5:40 AM, Jay Pipes wrote:
> > On 09/25/2017 05:39 AM, Sahid Orentino Ferdjaoui wrote:
> > > There is a desire to expose vGPU resources on top of Resource 
> > > Providers, which is probably the path we should take in the long 
> > > term. I was not at the last PTG and you have probably already made 
> > > a decision about moving in that direction anyway. My personal 
> > > feeling is that it is premature.
> > > 
> > > The nested Resource Provider work is not yet feature-complete and 
> > > requires more reviewer attention. If we continue in the direction 
> > > of Resource Providers, it will take at least two more releases to 
> > > expose the vGPU feature, and that without NUMA support, with the 
> > > feeling of pushing something which is not stable/production-ready.
> > > 
> > > It seems safer to first get the Resource Provider work 
> > > finalized/stabilized and production-ready. Then, on top of 
> > > something stable, we could start to migrate our current 
> > > virt-specific features like NUMA, CPU pinning, huge pages and 
> > > finally PCI devices.
> > > 
> > > I'm talking about PCI devices in general because I think we should 
> > > implement vGPU support on top of our /pci framework, which is 
> > > production-ready and provides NUMA support.
> > > 
> > > The hardware vendors are building their drivers using mdev. The 
> > > /pci framework currently understands only SR-IOV, but at a quick 
> > > glance it does not seem complicated to make it support mdev as well.
> > > 
> > > In the /pci framework we will have to:
> > > 
> > > * Update the PciDevice object fields to accept a NULL value for
> > >    'address' and add a new field 'uuid'
> > > * Update PciRequest to handle a new tag like 'vgpu_types'
> > > * Update PciDeviceStats to also maintain pools of vGPUs
> > > 
> > > The operators will have to create an alias (or aliases) and 
> > > configure flavors. Most of the logic is already implemented, and 
> > > the method 'consume_request' will select the right vGPUs 
> > > according to the request.
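For illustration, under this proposal the operator-facing setup would presumably look much like today's PCI passthrough configuration. The 'vgpu_types' tag, the "type-VGPU" device type and the type name below are hypothetical, not an existing Nova interface:

    # nova.conf on the compute node: hypothetical vGPU alias
    [pci]
    alias = { "name": "vgpu-type-a", "vgpu_types": ["nvidia-35"], "device_type": "type-VGPU" }

    # flavor requesting one vGPU through that alias (existing extra spec syntax)
    $ openstack flavor set vgpu.small --property "pci_passthrough:alias"="vgpu-type-a:1"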
> > > 
> > > In /virt we will have to:
> > > 
> > > * Update the field 'pci_passthrough_devices' to also include GPU
> > >    devices.
> > > * Update the attach/detach PCI device code to handle vGPUs
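A rough sketch of how such a vGPU entry might sit next to the existing PCI entries in 'pci_passthrough_devices', with the NULL 'address' and the new 'uuid' field proposed above; the exact keys and values for the vGPU case are assumptions, not a current field format:

    # Hypothetical vGPU entry reported by the virt driver alongside the
    # usual PCI passthrough entries; 'address' is None for an mdev-backed
    # vGPU and 'uuid' is the new identifying field proposed above.
    vgpu_device = {
        "address": None,            # no PCI address for an mdev-backed vGPU
        "uuid": "0788bd2a-8b97-4a26-b1c3-1d71f4d3c7e6",  # example value
        "vendor_id": "10de",        # NVIDIA, as an example
        "product_id": "vgpu",       # placeholder; an mdev vGPU has no PCI product id
        "dev_type": "type-VGPU",    # hypothetical new device type
        "numa_node": 0,
        "label": "label_vgpu_nvidia-35",
    }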
> > > 
> > > We have a few people interested in working on it, so we could 
> > > certainly make this feature available for Queens.
> > > 
> > > I can take the lead on updating/implementing the PCI and libvirt 
> > > driver parts, and I'm sure Jianghua Wang will be happy to take the 
> > > lead on the XenServer virt part.
> > > 
> > > And I trust Jay, Stephen and Sylvain to follow the developments.
> > 
> > I understand the desire to get something into Nova to support 
> > vGPUs, and I understand that the existing /pci modules represent the 
> > fastest/cheapest way to get there.
> > 
> > I won't block you from making any of the above changes, Sahid. I'll 
> > even do my best to review them. However, I will be primarily 
> > focusing this cycle on getting the nested resource providers work 
> > feature-complete for (at least) SR-IOV PF/VF devices.
> > 
> > The decision of whether to allow an approach that adds more to the 
> > existing /pci module is ultimately Matt's.
> > 
> > Best,
> > -jay
> > 
> 
> Nested resource providers are not merged or production-ready because we 
> haven't made them a priority. We have certainly talked about it, though, 
> and Jay has had patches proposed for several releases now.
> 
> Building vGPU support into the existing framework, which only a couple 
> of people understand (certainly not me), might be a short-term gain, but 
> it is just more technical debt we have to pay off later, and it delays 
> any focus on nested resource providers for the wider team.
> 
> At the Queens PTG it was abundantly clear that many features are 
> dependent on nested resource providers, including several 
> networking-related features like bandwidth-based scheduling.
> 
> The priorities for placement/scheduler in Queens are:
> 
> 1. Dan Smith's migration allocations cleanup.
> 2. Alternative hosts for reschedules with cells v2.
> 3. Nested resource providers.
> 
> All of these are in progress and need review.
> 
> I personally don't think we should abandon the plan to implement vGPU 
> support with nested resource providers without first seeing any code 
> changes for it as a proof of concept. It also sounds like we have a 
> pretty simple staggered plan for rolling out vGPU support, so it doesn't 
> need to be very detailed to start: the virt driver reports vGPU 
> inventory, and we decorate the details later with traits (which Alex Xu 
> is working on and which needs review).
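As a sketch of what "the virt driver reports vGPU inventory" could boil down to on the placement side, assuming a VGPU (or custom) resource class exists and leaving aside how the driver hooks into the resource tracker; the endpoint URL and provider UUID below are placeholders:

    import requests

    PLACEMENT = "http://placement.example.com/placement"   # assumed endpoint
    RP_UUID = "c3d4e5f6-0000-0000-0000-000000000000"        # compute node provider

    def report_vgpu_inventory(headers, generation, total_vgpus):
        """Publish VGPU inventory for a resource provider (sketch only).

        'headers' must carry the auth token and placement microversion;
        'generation' is the provider's current generation.
        """
        payload = {
            "resource_provider_generation": generation,
            "inventories": {
                "VGPU": {                  # assumes such a resource class exists
                    "total": total_vgpus,
                    "reserved": 0,
                    "min_unit": 1,
                    "max_unit": total_vgpus,
                    "step_size": 1,
                    "allocation_ratio": 1.0,
                },
            },
        }
        resp = requests.put(
            "%s/resource_providers/%s/inventories" % (PLACEMENT, RP_UUID),
            json=payload, headers=headers)
        resp.raise_for_status()
        return resp.json()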
> 
> Sahid, you could certainly implement a separate proof of concept and 
> make that available, so that if the nested resource providers-based 
> change hits major issues or takes far too long and carries too much 
> risk, we at least have a contingency plan. But I don't expect that to 
> get review priority, and you'd have to accept that it might not get 
> merged, since we want to use nested resource providers.

That seems fair. I understand your desire to make the implementation on Resource Providers a priority, and I'm with you. In general, my preference is not to stop progress on virt features just because we have a new "product" in progress.

> Either way we are going to need solid functional testing and that 
> functional testing should be written against the API as much as 
> possible so that it works regardless of the backend implementation of 
> the feature. One of the big things we failed at in Pike was not doing 
> enough functional testing of move operations with claims in the 
> scheduler earlier in the cycle. That all came in late and we're still fixing bugs as a result.

That's very true, and most of the time we are asking our users to be beta-testers; that is one more reason why my preference is for a real deprecation phase.

> If we can get started early on the functional testing for vGPUs and 
> then work on both implementations in parallel, we should be able to 
> retain the functional tests and determine which implementation we 
> ultimately need to go with, probably sometime in the second milestone.
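A minimal sketch of what such an API-level check could look like, so it stays valid whichever backend wins: boot a server from a flavor that is assumed to request one vGPU (e.g. via a 'resources:VGPU=1' extra spec, or a pci alias under the /pci approach), then verify the outcome through the public APIs. The endpoint URLs are placeholders and the allocation check is only meaningful for the resource-provider backend:

    import requests

    COMPUTE = "http://nova.example.com/v2.1"              # assumed endpoints
    PLACEMENT = "http://placement.example.com/placement"

    def test_boot_server_with_vgpu(headers, flavor_id, image_id, net_id):
        """Boot a server from a vGPU-requesting flavor and check the result."""
        body = {"server": {"name": "vgpu-func-test",
                           "flavorRef": flavor_id,
                           "imageRef": image_id,
                           "networks": [{"uuid": net_id}]}}
        resp = requests.post("%s/servers" % COMPUTE, json=body, headers=headers)
        resp.raise_for_status()
        server_id = resp.json()["server"]["id"]

        # ... wait for the server to reach ACTIVE (omitted for brevity) ...

        # With the resource-provider backend, the consumer's allocations
        # should include one VGPU; the /pci backend would need a
        # different, equally API-level assertion.
        alloc = requests.get("%s/allocations/%s" % (PLACEMENT, server_id),
                             headers=headers).json()
        assert any("VGPU" in a.get("resources", {})
                   for a in alloc["allocations"].values())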
> 
> --
> 
> Thanks,
> 
> Matt
> 
