device compatibility interface for live migration with assigned devices

Yan Zhao yan.y.zhao at intel.com
Wed Aug 19 03:30:35 UTC 2020


On Tue, Aug 18, 2020 at 09:39:24AM +0000, Parav Pandit wrote:
> Hi Cornelia,
> 
> > From: Cornelia Huck <cohuck at redhat.com>
> > Sent: Tuesday, August 18, 2020 3:07 PM
> > To: Daniel P. Berrangé <berrange at redhat.com>
> > Cc: Jason Wang <jasowang at redhat.com>; Yan Zhao
> > <yan.y.zhao at intel.com>; kvm at vger.kernel.org; libvir-list at redhat.com;
> > qemu-devel at nongnu.org; Kirti Wankhede <kwankhede at nvidia.com>;
> > eauger at redhat.com; xin-ran.wang at intel.com; corbet at lwn.net; openstack-
> > discuss at lists.openstack.org; shaohe.feng at intel.com; kevin.tian at intel.com;
> > Parav Pandit <parav at mellanox.com>; jian-feng.ding at intel.com;
> > dgilbert at redhat.com; zhenyuw at linux.intel.com; hejie.xu at intel.com;
> > bao.yumeng at zte.com.cn; Alex Williamson <alex.williamson at redhat.com>;
> > eskultet at redhat.com; smooney at redhat.com; intel-gvt-
> > dev at lists.freedesktop.org; Jiri Pirko <jiri at mellanox.com>;
> > dinechin at redhat.com; devel at ovirt.org
> > Subject: Re: device compatibility interface for live migration with assigned
> > devices
> > 
> > On Tue, 18 Aug 2020 10:16:28 +0100
> > Daniel P. Berrangé <berrange at redhat.com> wrote:
> > 
> > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote:
> > > >    On 2020/8/18 下午4:55, Daniel P. Berrangé wrote:
> > > >
> > > >  On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote:
> > > >
> > > >  On 2020/8/14 下午1:16, Yan Zhao wrote:
> > > >
> > > >  On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote:
> > > >
> > > >  On 2020/8/10 下午3:46, Yan Zhao wrote:
> > >
> > > > >  we actually can also retrieve the same information through sysfs,
> > > > > e.g.
> > > >
> > > > >  |- [path to device]
> > > > >     |--- migration
> > > > >     |     |--- self
> > > > >     |     |   |---device_api
> > > > >     |     |   |---mdev_type
> > > > >     |     |   |---software_version
> > > > >     |     |   |---device_id
> > > > >     |     |   |---aggregator
> > > > >     |     |--- compatible
> > > > >     |     |   |---device_api
> > > > >     |     |   |---mdev_type
> > > > >     |     |   |---software_version
> > > > >     |     |   |---device_id
> > > > >     |     |   |---aggregator
> > > >
> > > >
> > > >  Yes but:
> > > >
> > > >  - You need one file per attribute (one syscall for one attribute)
> > > >  - Attribute is coupled with kobject
> > 
> > Is that really that bad? You have the device with an embedded kobject
> > anyway, and you can just put things into an attribute group?
> > 
> > [Also, I think that self/compatible split in the example makes things
> > needlessly complex. Shouldn't semantic versioning and matching already
> > cover nearly everything? I would expect very few cases that are more
> > complex than that. Maybe the aggregation stuff, but I don't think we need
> > that self/compatible split for that, either.]
> > 
> > > >
> > > >  All of above seems unnecessary.
> > > >
> > > > >  Another point, as we discussed in another thread, it's really hard
> > > > >  to make sure the above API works for all types of devices and
> > > > >  frameworks. So having a vendor specific API looks much better.
> > > > >
> > > > >  From the POV of userspace mgmt apps doing device compat checking /
> > > > >  migration, we certainly do NOT want to use different vendor
> > > > >  specific APIs. We want to have an API that can be used / controlled
> > > > >  in a standard manner across vendors.
> > > > >
> > > > >    Yes, but it could be hard. E.g. vDPA will choose to use devlink (there's a
> > > > >    long debate on sysfs vs devlink). So if we go with sysfs, at least two
> > > > >    APIs need to be supported ...
> > >
> > > NB, I was not questioning devlink vs sysfs directly. If devlink is
> > > related to netlink, I can't say I'm enthusiastic as IMKE sysfs is
> > > > easier to deal with. I don't know enough about devlink to have much
> > > > of an opinion though.
> > > The key point was that I don't want the userspace APIs we need to deal
> > > with to be vendor specific.
> > 
> > From what I've seen of devlink, it seems quite nice; but I understand why
> > sysfs might be easier to deal with (especially as there's likely already a lot of
> > code using it.)
> > 
> > I understand that some users would like devlink because it is already widely
> > used for network drivers (and some others), but I don't think the majority of
> > devices used with vfio are network (although certainly a lot of them are.)
> > 
> > >
> > > What I care about is that we have a *standard* userspace API for
> > > performing device compatibility checking / state migration, for use by
> > > QEMU/libvirt/ OpenStack, such that we can write code without countless
> > > vendor specific code paths.
> > >
> > > If there is vendor specific stuff on the side, that's fine as we can
> > > ignore that, but the core functionality for device compat / migration
> > > needs to be standardized.
> > 
> > To summarize:
> > - choose one of sysfs or devlink
> > - have a common interface, with a standardized way to add
> >   vendor-specific attributes
> > ?
> 
> Please refer to my previous email which has more examples and details.

Hi Parav,
the example is based on a new vdpa tool running over netlink, not on
devlink, right?
For vfio migration compatibility, we have to deal with both mdev and physical
PCI devices. I don't think it's a good idea to write a new tool for it, given
that we can retrieve the same info from sysfs and there is already
mdevctl from Alex (https://github.com/mdevctl/mdevctl).

Hi all,
could we decide that sysfs is the interface that every VFIO vendor driver
needs to provide in order to support vfio live migration, and that otherwise
the userspace management tool will not list the device in the compatible
list?

If so, let's move on to standardizing the sysfs interface.
(1) content
common part: (must)
   - software_version: (in major.minor.bugfix scheme)
   - device_api: vfio-pci or vfio-ccw ...
   - type: mdev type for an mdev device, or
           a signature for a physical device that is the counterpart of
           the mdev type.
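
As a rough illustration of how a userspace tool might handle a
software_version in the major.minor.bugfix scheme, here is a minimal Python
sketch. The function names are hypothetical, and the matching rule (same
major, target minor not older than the source minor) is only an assumption
for the sake of the example; nothing in this thread has fixed the rule yet.

```python
def parse_software_version(text):
    """Split a 'major.minor.bugfix' string into a tuple of three ints."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.bugfix, got {text!r}")
    major, minor, bugfix = (int(p) for p in parts)
    return major, minor, bugfix


def is_version_compatible(source, target):
    """Hypothetical rule, NOT defined in this thread:
    same major version, and target minor >= source minor."""
    src = parse_software_version(source)
    dst = parse_software_version(target)
    return src[0] == dst[0] and dst[1] >= src[1]
```
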

device api specific part: (must)
  - pci id: pci id of the mdev parent device, or pci id of the physical pci
    device (device_api is vfio-pci)
  - subchannel_type (device_api is vfio-ccw)

vendor driver specific part: (optional)
  - aggregator
  - chpid_type
  - remote_url

NOTE: vendors are free to add attributes in this part, with the
restriction that each such attribute must also be configurable under the
same name in sysfs. E.g.
for aggregator, there must be a sysfs attribute in the device node
/sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180-078a62063ab1/intel_vgpu/aggregator,
so that the userspace tool is able to configure the target device
according to the source device's aggregator attribute.
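
To make the restriction concrete, a userspace tool could apply a vendor
attribute to the target roughly as in the Python sketch below. The function
name and the directory arguments are hypothetical; the sketch only assumes
that the source exposes the attribute as a one-value text file and that the
target device node has a writable attribute with the same name.

```python
from pathlib import Path


def copy_vendor_attr(source_migration_dir, target_device_dir, name):
    """Copy one vendor attribute value from the source to the target.

    Assumptions (not settled in this thread): `name` is a one-value text
    file on the source side, and the target has a writable attribute of
    the same name, e.g. .../intel_vgpu/aggregator.
    """
    value = (Path(source_migration_dir) / name).read_text().strip()
    target_attr = Path(target_device_dir) / name
    if not target_attr.exists():
        # Without a same-named attribute the target cannot be configured.
        raise FileNotFoundError(f"target has no attribute {name!r}")
    target_attr.write_text(value + "\n")
    return value
```
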


(2) location and structure
proposal 1:
|- [path to device]
  |--- migration
  |     |--- self
  |     |    |-software_version
  |     |    |-device_api
  |     |    |-type
  |     |    |-[pci_id or subchannel_type]
  |     |    |-<aggregator or chpid_type>
  |     |--- compatible
  |     |    |-software_version
  |     |    |-device_api
  |     |    |-type
  |     |    |-[pci_id or subchannel_type]
  |     |    |-<aggregator or chpid_type>
Multiple compatible entries are allowed.
Attributes should be ASCII text files, preferably with only one value
per file.
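
For illustration, reading the one-value-per-file layout of proposal 1 from
userspace could look like the Python sketch below. The helper names are
hypothetical, and it assumes a single "self" and a single "compatible"
directory under [path to device]/migration; how multiple compatible entries
would be laid out is left open here.

```python
from pathlib import Path


def read_attrs(directory):
    """Return {attribute name: value} for every regular file in directory."""
    attrs = {}
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_file():
            # One ASCII value per file; strip the trailing newline.
            attrs[entry.name] = entry.read_text().strip()
    return attrs


def read_migration_info(device_path):
    """Read both attribute groups below <device_path>/migration."""
    migration = Path(device_path) / "migration"
    return {
        "self": read_attrs(migration / "self"),
        "compatible": read_attrs(migration / "compatible"),
    }
```

Each attribute costs one open/read, which is the overhead Jason pointed out
earlier in the thread.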


proposal 2: use bin_attribute.
|- [path to device]
  |--- migration
  |     |--- self
  |     |--- compatible

so we can continue to use the multiline format, e.g.
cat compatible
  software_version=0.1.0
  device_api=vfio_pci
  type=i915-GVTg_V5_{val1:int:1,2,4,8}
  pci_id=80865963
  aggregator={val1}/2

Thanks
Yan


