31 Aug
2020
31 Aug
'20
12:07 p.m.
On 2020/8/21 下午10:52, Cornelia Huck wrote: > On Fri, 21 Aug 2020 11:14:41 +0800 > Jason Wang <jasowang@redhat.com> wrote: > >> On 2020/8/20 下午8:27, Cornelia Huck wrote: >>> On Wed, 19 Aug 2020 17:28:38 +0800 >>> Jason Wang <jasowang@redhat.com> wrote: >>> >>>> On 2020/8/19 下午4:13, Yan Zhao wrote: >>>>> On Wed, Aug 19, 2020 at 03:39:50PM +0800, Jason Wang wrote: >>>>>> On 2020/8/19 下午2:59, Yan Zhao wrote: >>>>>>> On Wed, Aug 19, 2020 at 02:57:34PM +0800, Jason Wang wrote: >>>>>>>> On 2020/8/19 上午11:30, Yan Zhao wrote: >>>>>>>>> hi All, >>>>>>>>> could we decide that sysfs is the interface that every VFIO vendor driver >>>>>>>>> needs to provide in order to support vfio live migration, otherwise the >>>>>>>>> userspace management tool would not list the device into the compatible >>>>>>>>> list? >>>>>>>>> >>>>>>>>> if that's true, let's move to the standardizing of the sysfs interface. >>>>>>>>> (1) content >>>>>>>>> common part: (must) >>>>>>>>> - software_version: (in major.minor.bugfix scheme) >>>>>>>> This can not work for devices whose features can be negotiated/advertised >>>>>>>> independently. (E.g virtio devices) >>> I thought the 'software_version' was supposed to describe kind of a >>> 'protocol version' for the data we transmit? I.e., you add a new field, >>> you bump the version number. >> >> Ok, but since we mandate backward compatibility of uABI, is this really >> worth to have a version for sysfs? (Searching on sysfs shows no examples >> like this) > I was not thinking about the sysfs interface, but rather about the data > that is sent over while migrating. E.g. we find out that sending some > auxiliary data is a good idea and bump to version 1.1.0; version 1.0.0 > cannot deal with the extra data, but version 1.1.0 can deal with the > older data stream. > > (...) Well, I think what data to transmit during migration is the duty of qemu not kernel. And I suspect the idea of reading opaque data (with version) from kernel and transmit them to dest is the best approach. > >>>>>>>>> - device_api: vfio-pci or vfio-ccw ... >>>>>>>>> - type: mdev type for mdev device or >>>>>>>>> a signature for physical device which is a counterpart for >>>>>>>>> mdev type. >>>>>>>>> >>>>>>>>> device api specific part: (must) >>>>>>>>> - pci id: pci id of mdev parent device or pci id of physical pci >>>>>>>>> device (device_api is vfio-pci)API here. >>>>>>>> So this assumes a PCI device which is probably not true. >>>>>>>> >>>>>>> for device_api of vfio-pci, why it's not true? >>>>>>> >>>>>>> for vfio-ccw, it's subchannel_type. >>>>>> Ok but having two different attributes for the same file is not good idea. >>>>>> How mgmt know there will be a 3rd type? >>>>> that's why some attributes need to be common. e.g. >>>>> device_api: it's common because mgmt need to know it's a pci device or a >>>>> ccw device. and the api type is already defined vfio.h. >>>>> (The field is agreed by and actually suggested by Alex in previous mail) >>>>> type: mdev_type for mdev. if mgmt does not understand it, it would not >>>>> be able to create one compatible mdev device. >>>>> software_version: mgmt can compare the major and minor if it understands >>>>> this fields. >>>> I think it would be helpful if you can describe how mgmt is expected to >>>> work step by step with the proposed sysfs API. This can help people to >>>> understand. >>> My proposal would be: >>> - check that device_api matches >>> - check possible device_api specific attributes >>> - check that type matches [I don't think the combination of mdev types >>> and another attribute to determine compatibility is a good idea; >> >> Any reason for this? Actually if we only use mdev type to detect the >> compatibility, it would be much more easier. Otherwise, we are actually >> re-inventing mdev types. >> >> E.g can we have the same mdev types with different device_api and other >> attributes? > In the end, the mdev type is represented as a string; but I'm not sure > we can expect that two types with the same name, but a different > device_api are related in any way. > > If we e.g. compare vfio-pci and vfio-ccw, they are fundamentally > different. > > I was mostly concerned about the aggregation proposal, where type A + > aggregation value b might be compatible with type B + aggregation value > a. Yes, that looks pretty complicated. > >> >>> actually, the current proposal confuses me every time I look at it] >>> - check that software_version is compatible, assuming semantic >>> versioning >>> - check possible type-specific attributes >> >> I'm not sure if this is too complicated. And I suspect there will be >> vendor specific attributes: >> >> - for compatibility check: I think we should either modeling everything >> via mdev type or making it totally vendor specific. Having something in >> the middle will bring a lot of burden > FWIW, I'm for a strict match on mdev type, and flexibility in per-type > attributes. I'm not sure whether the above flexibility can work better than encoding them to mdev type. If we really want ultra flexibility, we need making the compatibility check totally vendor specific. > >> - for provisioning: it's still not clear. As shown in this proposal, for >> NVME we may need to set remote_url, but unless there will be a subclass >> (NVME) in the mdev (which I guess not), we can't prevent vendor from >> using another attribute name, in this case, tricks like attributes >> iteration in some sub directory won't work. So even if we had some >> common API for compatibility check, the provisioning API is still vendor >> specific ... > Yes, I'm not sure how to deal with the "same thing for different > vendors" problem. We can try to make sure that in-kernel drivers play > nicely, but not much more. Then it's actually a subclass of mdev I guess in the future. Thanks