device compatibility interface for live migration with assigned devices

Jiri Pirko jiri at mellanox.com
Wed Aug 5 10:53:19 UTC 2020


Wed, Aug 05, 2020 at 11:33:38AM CEST, yan.y.zhao at intel.com wrote:
>On Wed, Aug 05, 2020 at 04:02:48PM +0800, Jason Wang wrote:
>> 
>> On 2020/8/5 下午3:56, Jiri Pirko wrote:
>> > Wed, Aug 05, 2020 at 04:41:54AM CEST, jasowang at redhat.com wrote:
>> > > On 2020/8/5 上午10:16, Yan Zhao wrote:
>> > > > On Wed, Aug 05, 2020 at 10:22:15AM +0800, Jason Wang wrote:
>> > > > > On 2020/8/5 上午12:35, Cornelia Huck wrote:
>> > > > > > [sorry about not chiming in earlier]
>> > > > > > 
>> > > > > > On Wed, 29 Jul 2020 16:05:03 +0800
>> > > > > > Yan Zhao <yan.y.zhao at intel.com> wrote:
>> > > > > > 
>> > > > > > > On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote:
>> > > > > > (...)
>> > > > > > 
>> > > > > > > > Based on the feedback we've received, the previously proposed interface
>> > > > > > > > is not viable.  I think there's agreement that the user needs to be
>> > > > > > > > able to parse and interpret the version information.  Using json seems
>> > > > > > > > viable, but I don't know if it's the best option.  Is there any
>> > > > > > > > precedent of markup strings returned via sysfs we could follow?
>> > > > > > I don't think encoding complex information in a sysfs file is a viable
>> > > > > > approach. Quoting Documentation/filesystems/sysfs.rst:
>> > > > > > 
>> > > > > > "Attributes should be ASCII text files, preferably with only one value
>> > > > > > per file. It is noted that it may not be efficient to contain only one
>> > > > > > value per file, so it is socially acceptable to express an array of
>> > > > > > values of the same type.
>> > > > > > Mixing types, expressing multiple lines of data, and doing fancy
>> > > > > > formatting of data is heavily frowned upon."
>> > > > > > 
>> > > > > > Even though this is an older file, I think these restrictions still
>> > > > > > apply.
>> > > > > +1, that's another reason why devlink(netlink) is better.
>> > > > > 
>> > > > hi Jason,
>> > > > do you have any materials or sample code about devlink, so we can have a good
>> > > > study of it?
>> > > > I found some kernel docs about it but my preliminary study didn't show me the
>> > > > advantage of devlink.
>> > > 
>> > > CC Jiri and Parav for a better answer for this.
>> > > 
>> > > My understanding is that the following advantages are obvious (as I replied
>> > > in another thread):
>> > > 
>> > > - existing users (NIC, crypto, SCSI, ib), mature and stable
>> > > - much better error reporting (ext_ack other than string or errno)
>> > > - namespace aware
>> > > - do not couple with kobject
>> > Jason, what is your use case?
>> 
>> 
>> I think the use case is to report device compatibility for live migration.
>> Yan proposed a simple sysfs based migration version first, but it looks not
>> sufficient and something based on JSON is discussed.
>> 
>> Yan, can you help to summarize the discussion so far for Jiri as a
>> reference?
>> 
>yes.
>we are currently defining an device live migration compatibility
>interface in order to let user space like openstack and libvirt knows
>which two devices are live migration compatible.
>currently the devices include mdev (a kernel emulated virtual device)
>and physical devices (e.g.  a VF of a PCI SRIOV device).
>
>the attributes we want user space to compare including
>common attribues:
>    device_api: vfio-pci, vfio-ccw...
>    mdev_type: mdev type of mdev or similar signature for physical device
>               It specifies a device's hardware capability. e.g.
>	       i915-GVTg_V5_4 means it's of 1/4 of a gen9 Intel graphics
>	       device.
>    software_version: device driver's version.
>               in <major>.<minor>[.bugfix] scheme, where there is no
>	       compatibility across major versions, minor versions have
>	       forward compatibility (ex. 1-> 2 is ok, 2 -> 1 is not) and
>	       bugfix version number indicates some degree of internal
>	       improvement that is not visible to the user in terms of
>	       features or compatibility,
>
>vendor specific attributes: each vendor may define different attributes
>   device id : device id of a physical devices or mdev's parent pci device.
>               it could be equal to pci id for pci devices
>   aggregator: used together with mdev_type. e.g. aggregator=2 together
>               with i915-GVTg_V5_4 means 2*1/4=1/2 of a gen9 Intel
>	       graphics device.
>   remote_url: for a local NVMe VF, it may be configured with a remote
>               url of a remote storage and all data is stored in the
>	       remote side specified by the remote url.
>   ...
>
>Comparing those attributes by user space alone is not an easy job, as it
>can't simply assume an equal relationship between source attributes and
>target attributes. e.g.
>for a source device of mdev_type=i915-GVTg_V5_4,aggregator=2, (1/2 of
>gen9), it actually could find a compatible device of
>mdev_type=i915-GVTg_V5_8,aggregator=4 (also 1/2 of gen9),
>if mdev_type of i915-GVTg_V5_4 is not available in the target machine.
>
>So, in our current proposal, we want to create two sysfs attributes
>under a device sysfs node.
>/sys/<path to device>/migration/self
>/sys/<path to device>/migration/compatible
>
>#cat /sys/<path to device>/migration/self
>device_type=vfio_pci
>mdev_type=i915-GVTg_V5_4
>device_id=8086591d
>aggregator=2
>software_version=1.0.0
>
>#cat /sys/<path to device>/migration/compatible
>device_type=vfio_pci
>mdev_type=i915-GVTg_V5_{val1:int:2,4,8}
>device_id=8086591d
>aggregator={val1}/2
>software_version=1.0.0
>
>The /sys/<path to device>/migration/self specifies self attributes of
>a device.
>The /sys/<path to device>/migration/compatible specifies the list of
>compatible devices of a device. as in the example, compatible devices
>could have
>	device_type == vfio_pci &&
>	device_id == 8086591d   &&
>	software_version == 1.0.0 &&
>        (
>	(mdev_type of i915-GVTg_V5_2 && aggregator==1) ||
>	(mdev_type of i915-GVTg_V5_4 && aggregator==2) ||
>	(mdev_type of i915-GVTg_V5_8 && aggregator=4)
>	)
>
>by comparing whether a target device is in compatible list of source
>device, the user space can know whether a two devices are live migration
>compatible.
>
>Additional notes:
>1)software_version in the compatible list may not be necessary as it
>already has a major.minor.bugfix scheme.
>2)for vendor attribute like remote_url, it may not be statically
>assigned and could be changed with a device interface.
>
>So, as Cornelia pointed that it's not good to use complex format in
>a sysfs attribute, we'd like to know whether there're other good ways to
>our use case, e.g. splitting a single attribute to multiple simple sysfs
>attributes as what Cornelia suggested or devlink that Jason has strongly
>recommended.

Hi Yan.

Thanks for the explanation, I'm still fuzzy about the details.
Anyway, I suggest you to check "devlink dev info" command we have
implemented for multiple drivers. You can try netdevsim to test this.
I think that the info you need to expose might be put there.

Devlink creates instance per-device. Specific device driver calls into
devlink core to create the instance.  What device do you have? What
driver is it handled by?


>
>Thanks
>Yan
>
>
>



More information about the openstack-discuss mailing list