device compatibility interface for live migration with assigned devices

Cornelia Huck cohuck at redhat.com
Mon Aug 17 06:38:28 UTC 2020


On Thu, 13 Aug 2020 15:02:53 -0400
Eric Farman <farman at linux.ibm.com> wrote:

> On 8/13/20 11:33 AM, Cornelia Huck wrote:
> > On Fri, 7 Aug 2020 13:59:42 +0200
> > Cornelia Huck <cohuck at redhat.com> wrote:
> >   
> >> On Wed, 05 Aug 2020 12:35:01 +0100
> >> Sean Mooney <smooney at redhat.com> wrote:
> >>  
> >>> On Wed, 2020-08-05 at 12:53 +0200, Jiri Pirko wrote:    
> >>>> Wed, Aug 05, 2020 at 11:33:38AM CEST, yan.y.zhao at intel.com wrote:      
> >>
> >> (...)
> >>  
> >>>>>    software_version: device driver's version.
> >>>>>               in <major>.<minor>[.bugfix] scheme, where there is no
> >>>>> 	       compatibility across major versions, minor versions have
> >>>>> 	       forward compatibility (ex. 1-> 2 is ok, 2 -> 1 is not) and
> >>>>> 	       bugfix version number indicates some degree of internal
> >>>>> 	       improvement that is not visible to the user in terms of
> >>>>> 	       features or compatibility,
> >>>>>
> >>>>> vendor specific attributes: each vendor may define different attributes
> >>>>>   device id : device id of a physical device or mdev's parent pci device.
> >>>>>               it could be equal to pci id for pci devices
> >>>>>   aggregator: used together with mdev_type. e.g. aggregator=2 together
> >>>>>               with i915-GVTg_V5_4 means 2*1/4=1/2 of a gen9 Intel
> >>>>> 	       graphics device.
> >>>>>   remote_url: for a local NVMe VF, it may be configured with a remote
> >>>>>               url of a remote storage and all data is stored in the
> >>>>> 	       remote side specified by the remote url.
> >>>>>   ...      
> >>> just a minor note that i find ^ much simpler to understand than
> >>> the current proposal with self and compatible.
> >>> if i have well-defined attributes that i can parse and understand, and that
> >>> allow me to calculate what is and is not compatible, that is likely going to be
> >>> more useful, as you won't have to keep maintaining a list of other compatible
> >>> devices every time a new sku is released.
> >>>
> >>> in any case, thanks for actually sharing ^ as it makes it simpler to reason about what
> >>> you have previously proposed.    
> >>
> >> So, what would be the most helpful format? A 'software_version' field
> >> that follows the conventions outlined above, and other (possibly
> >> optional) fields that have to match?  
> > 
> > Just to get a different perspective, I've been trying to come up with
> > what would be useful for a very different kind of device, namely
> > vfio-ccw. (Adding Eric to cc: for that.)
> > 
> > software_version makes sense for everybody, so it should be a standard
> > attribute.
> > 
> > For the vfio-ccw type, we have only one vendor driver (vfio-ccw_IO).
> > 
> > Given a subchannel A, we want to make sure that subchannel B has a
> > reasonable chance of being compatible. I guess that means:
> > 
> > - same subchannel type (I/O)
> > - same chpid type (e.g. all FICON; I assume there are no 'mixed' setups
> >   -- Eric?)  
> 
> Correct.
> 
> > - same number of chpids? Maybe we can live without that and just inject
> >   some machine checks, I don't know. Same chpid numbers is something we
> >   cannot guarantee, especially if we want to migrate cross-CEC (to
> >   another machine.)  
> 
> I think we'd live without it, because I wouldn't expect it to be
> consistent between systems.

Yes, and the guest needs to be able to deal with changing path
configurations anyway.

> 
> > 
> > Other possibly interesting information is not available at the
> > subchannel level (vfio-ccw is a subchannel driver.)  
> 
> I presume you're alluding to the DASD uid (dasdinfo -x) here?

Yes, or the even more basic Sense ID information.

> 
> > 
> > So, looking at a concrete subchannel on one of my machines, it would
> > look something like the following:
> > 
> > <common>
> > software_version=1.0.0
> > type=vfio-ccw          <-- would be vfio-pci in the example above
> > <vfio-ccw specific>
> > subchannel_type=0
> > <vfio-ccw_IO specific>
> > chpid_type=0x1a
> > chpid_mask=0xf0        <-- not sure if needed/wanted

Let's just drop the chpid_mask here.
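
To make that a bit more concrete, here is a rough Python sketch of how a
management tool might compare two such attribute sets once chpid_mask is
gone. It is only an illustration under assumptions: the function names, the
dict layout and the choice of which keys have to match exactly are made up
for this example and are not part of any proposed interface. The
software_version check follows the <major>.<minor>[.bugfix] convention
quoted above (same major, target minor not lower than source minor, bugfix
ignored).

```python
# Hypothetical sketch; none of these names exist in any real interface.

# Attributes assumed to require an exact match between source and target.
MATCH_KEYS = ("type", "subchannel_type", "chpid_type")


def parse_version(text):
    """Split '<major>.<minor>[.bugfix]' into (major, minor); bugfix is ignored."""
    parts = text.split(".")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected version string: {text!r}")
    return int(parts[0]), int(parts[1])


def version_compatible(src, dst):
    """True if a device at version src may migrate to one at version dst."""
    src_major, src_minor = parse_version(src)
    dst_major, dst_minor = parse_version(dst)
    # No compatibility across majors; minors are forward compatible only.
    return src_major == dst_major and src_minor <= dst_minor


def devices_compatible(src, dst):
    """Compare two attribute dicts: version rule plus exact-match keys."""
    if not version_compatible(src["software_version"], dst["software_version"]):
        return False
    return all(k in src and src[k] == dst.get(k) for k in MATCH_KEYS)


# Values taken from the example subchannel above (chpid_mask dropped).
source = {
    "software_version": "1.0.0",
    "type": "vfio-ccw",
    "subchannel_type": "0",
    "chpid_type": "0x1a",
}
# Same device attributes, but a newer minor version on the target side.
target = dict(source, software_version="1.1.3")

print(devices_compatible(source, target))  # True: 1.0 -> 1.1 is forward
print(devices_compatible(target, source))  # False: 1.1 -> 1.0 is backward
```

Vendor specific attributes would presumably need their own matching rules;
treating them all as exact-match strings is just the simplest assumption.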

> > 
> > Does that make sense?

Would be interesting if someone could come up with some possible
information for a third type of device.



