device compatibility interface for live migration with assigned devices

Dr. David Alan Gilbert dgilbert at redhat.com
Wed Aug 5 09:44:23 UTC 2020


* Yan Zhao (yan.y.zhao at intel.com) wrote:
> > > yes, include a device_api field is better.
> > > for mdev, "device_type=vfio-mdev", is it right?
> > 
> > No, vfio-mdev is not a device API, it's the driver that attaches to the
> > mdev bus device to expose it through vfio.  The device_api exposes the
> > actual interface of the vfio device, it's also vfio-pci for typical
> > mdev devices found on x86, but may be vfio-ccw, vfio-ap, etc...  See
> > VFIO_DEVICE_API_PCI_STRING and friends.
> > 
> ok. got it.
> 
> > > > > > 	device_id=8086591d  
> > > > 
> > > > Is device_id interpreted relative to device_type?  How does this
> > > > relate to mdev_type?  If we have an mdev_type, doesn't that fully
> > > > define the software API?
> > > >   
> > > it's parent pci id for mdev actually.
> >
> > If we need to specify the parent PCI ID then something is fundamentally
> > wrong with the mdev_type.  The mdev_type should define a unique,
> > software compatible interface, regardless of the parent device IDs.  If
> > an i915-GVTg_V5_2 means different things based on the parent device IDs,
> > then different mdev_types should be reported for those parent
> > devices.
> >
> hmm, then do we allow vendor specific fields?
> or is it a must that a vendor specific field should have corresponding
> vendor attribute?
> 
> another thing is that the definition of mdev_type in GVT only corresponds
> to vGPU computing ability currently,
> e.g. i915-GVTg_V5_2, is 1/2 of a gen9 IGD, i915-GVTg_V4_2 is 1/2 of a
> gen8 IGD.
> It is too coarse-grained for live migration compatibility.

Can you explain why that's too coarse?

Is this because it's too specific (i.e. that an i915-GVTg_V4_2 could be
migrated to a newer device?), or that it's too specific on the exact
sizings (i.e. that there may be multiple different sizes of a gen9)?

Dave

> Do you think we need to update GVT's definition of mdev_type?
> And is there any guide in mdev_type definition?
> 
> > > > > > 	mdev_type=i915-GVTg_V5_2  
> > > > 
> > > > And how are non-mdev devices represented?
> > > >   
> > > non-mdev can opt to not include this field, or as you said below, a
> > > vendor signature. 
> > > 
> > > > > > 	aggregator=1
> > > > > > 	pv_mode="none+ppgtt+context"  
> > > > 
> > > > These are meaningless vendor specific matches afaict.
> > > >   
> > > yes, pv_mode and aggregator are vendor specific fields.
> > > but they are important to decide whether two devices are compatible.
> > > pv_mode indicates which guest paravirtualized APIs a vGPU supports.
> > > "none+ppgtt+context" means the guest can use no pv, ppgtt mode pv, or
> > > context mode pv.
> > > 
> > > > > > 	interface_version=3  
> > > > 
> > > > Not much granularity here, I prefer Sean's previous
> > > > <major>.<minor>[.bugfix] scheme.
> > > >   
> > > yes, <major>.<minor>[.bugfix] scheme may be better, but I'm not sure if
> > > it works for a complicated scenario.
> > > e.g for pv_mode,
> > > (1) initially,  pv_mode is not supported, so it's pv_mode=none, it's 0.0.0,
> > > (2) then, pv_mode=ppgtt is supported, pv_mode="none+ppgtt", it's 0.1.0,
> > > indicating pv_mode=none can migrate to pv_mode="none+ppgtt", but not vice versa.
> > > (3) later, pv_mode=context is also supported,
> > > pv_mode="none+ppgtt+context", so it's 0.2.0.
> > > 
> > > But if pv_mode=ppgtt is later removed, leaving pv_mode="none+context",
> > > how should its version be named? "none+ppgtt" (0.1.0) is not compatible
> > > with "none+context", but "none+ppgtt+context" (0.2.0) is compatible with
> > > "none+context".
> > 
> > If pv_mode=ppgtt is removed, then the compatible versions would be
> > 0.0.0 or 1.0.0, ie. the major version would be incremented due to
> > feature removal.
> >  
> > > Maintaining such a scheme is painful for the vendor driver.
> > 
> > Migration compatibility is painful, there's no way around that.  I
> > think the version scheme is an attempt to push some of that low level
> > burden on the vendor driver, otherwise the management tools need to
> > work on an ever growing matrix of vendor specific features which is
> > going to become unwieldy and is largely meaningless outside of the
> > vendor driver.  Instead, the vendor driver can make strategic decisions
> > about where to continue to maintain a support burden and make explicit
> > decisions to maintain or break compatibility.  The version scheme is a
> > simplification and abstraction of vendor driver features in order to
> > create a small, logical compatibility matrix.  Compromises necessarily
> > need to be made for that to occur.
> >
> ok. got it.
> 
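For what it's worth, the pv_mode example above can be worked through as a
plain subset check. This is only a rough sketch: the version labels are the
ones used in this thread, but the PV_MODES table, the can_migrate() name and
the rule itself (every mode the source offers must also be offered by the
destination) are my assumptions, not an existing kernel or libvirt interface.

```python
# Sketch only. The version labels come from the thread; the table and the
# subset rule are assumptions, not an existing kernel or libvirt interface.
from typing import Dict, FrozenSet

PV_MODES: Dict[str, FrozenSet[str]] = {
    "0.0.0": frozenset({"none"}),
    "0.1.0": frozenset({"none", "ppgtt"}),
    "0.2.0": frozenset({"none", "ppgtt", "context"}),
    "1.0.0": frozenset({"none", "context"}),  # ppgtt removed, major bumped
}


def can_migrate(src: str, dst: str) -> bool:
    """True if every pv mode the source offers is also offered by the destination."""
    return PV_MODES[src] <= PV_MODES[dst]


if __name__ == "__main__":
    pairs = [("0.0.0", "0.1.0"), ("0.1.0", "0.0.0"),
             ("0.0.0", "1.0.0"), ("0.1.0", "1.0.0")]
    for src, dst in pairs:
        print(f"{src} -> {dst}: {can_migrate(src, dst)}")
```

Under that assumed rule, 1.0.0 accepts only 0.0.0 and 1.0.0 as sources, which
lines up with the compatible versions given above for the case where ppgtt is
removed. The point of the version scheme is that the management tool would only
ever see the version strings, and a table like this would stay inside the
vendor driver.
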
> > > > > > COMPATIBLE:
> > > > > > 	device_type=pci
> > > > > > 	device_id=8086591d
> > > > > > 	mdev_type=i915-GVTg_V5_{val1:int:1,2,4,8}    
> > > > > this mixed notation will be hard to parse so i would avoid that.  
> > > > 
> > > > Some background, Intel has been proposing aggregation as a solution to
> > > > how we scale mdev devices when hardware exposes large numbers of
> > > > assignable objects that can be composed in essentially arbitrary ways.
> > > > So for instance, if we have a workqueue (wq), we might have an mdev
> > > > type for 1wq, 2wq, 3wq,... Nwq.  It's not really practical to expose a
> > > > discrete mdev type for each of those, so they want to define a base
> > > > type which is composable to other types via this aggregation.  This is
> > > > what this substitution and tagging is attempting to accomplish.  So
> > > > imagine this set of values for cases where it's not practical to unroll
> > > > the values for N discrete types.
> > > >   
> > > > > > 	aggregator={val1}/2  
> > > > 
> > > > So the {val1} above would be substituted here, though an aggregation
> > > > factor of 1/2 is a head scratcher...
> > > >   
> > > > > > 	pv_mode={val2:string:"none+ppgtt","none+context","none+ppgtt+context"}  
> > > > 
> > > > I'm lost on this one though.  I think maybe it's indicating that it's
> > > > compatible with any of these, so do we need to list it?  Couldn't this
> > > > be handled by Sean's version proposal where the minor version
> > > > represents feature compatibility?  
> > > yes, it's indicating that it's compatible with any of these.
> > > Sean's version proposal may also work, but it would be painful for
> > > vendor driver to maintain the versions when multiple similar features
> > > are involved.
> > 
> > This is something vendor drivers need to consider when adding and
> > removing features.
> > 
> > > > > > 	interface_version={val3:int:2,3}  
> > > > 
> > > > What does this turn into in a few years, 2,7,12,23,75,96,...
> > > >   
> > > is a range better?
> > 
> > I was really trying to point out that sparseness becomes an issue if
> > the vendor driver is largely disconnected from how their feature
> > addition and deprecation affects migration support.  Thanks,
> >
> ok. we'll use the x.y.z scheme then.
> 
> Thanks
> Yan
> 
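To give a feel for the parsing concern raised about the mixed notation, here
is a rough sketch of unrolling one COMPATIBLE line of the form
key=prefix{name:type:v1,v2,...}suffix into its discrete values. The expand()
helper and the regular expression are hypothetical, it assumes at most one
placeholder per line, and it deliberately does not handle back-references
such as aggregator={val1}/2, whose meaning is still unclear in this thread.

```python
# Sketch only: expand() and PLACEHOLDER are hypothetical helpers for the
# notation proposed in this thread, not part of any existing tool.
import re
from typing import List

# Matches {name:int:1,2,4} or {name:string:"a","b"}.
PLACEHOLDER = re.compile(r"\{(\w+):(int|string):([^}]*)\}")


def expand(line: str) -> List[str]:
    """Unroll a single-placeholder COMPATIBLE line into concrete key=value strings."""
    key, _, value = line.strip().partition("=")
    match = PLACEHOLDER.search(value)
    if match is None:
        return [f"{key}={value}"]  # nothing to substitute
    _name, kind, raw = match.groups()
    items = [item.strip().strip('"') for item in raw.split(",")]
    if kind == "int":
        items = [str(int(item)) for item in items]  # raises ValueError on bad input
    head, tail = value[:match.start()], value[match.end():]
    return [f"{key}={head}{item}{tail}" for item in items]


if __name__ == "__main__":
    for out in expand("mdev_type=i915-GVTg_V5_{val1:int:1,2,4,8}"):
        print(out)
```

Even this simple case needs a small grammar, and tying {val1} to a second
field would need more again.
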
--
Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK



