[dev][nova] Problem about vm migration compatibility check
Hi there,
In my test environment, I created a VM and configured its CPU with host-model. When I migrated the VM to another host with the same CPU, it failed the migration compatibility check, which complained that the CPU definition of the domain is incompatible with the target host CPU.
As we know, when a domain configured this way starts, the host-model CPU definition is automatically converted to a custom CPU model plus some additional features that KVM supports. These additional features may include features that the host doesn't support.
In the code, compatibility with the target host is checked by calling compareCPU() (a libvirt API). compareCPU() only recognizes the features probed by the cpuid instruction on the host, so it may not recognize features in the CPU definition of the domain XML (virsh dumpxml domainname) while the domain is running. The compatibility check therefore fails when KVM supports one or more features that the cpuid instruction reports as disabled.
I think we should call compareHypervisorCPU() or something similar (supported by libvirt since v4.4.0) instead of compareCPU() to check migration compatibility.
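A minimal sketch of the proposed switch, assuming the libvirt-python bindings, where a connection object exposes compareCPU(xmlDesc, flags) and compareHypervisorCPU(emulator, arch, machine, virttype, xmlCPU, flags). The helper names build_cpu_xml and is_migratable, the default arch and virttype values, and passing None for the emulator and machine are illustrative assumptions, not Nova's actual code. The result constants are defined locally and mirror libvirt's virCPUCompareResult values.

```python
import xml.etree.ElementTree as ET

# Local copies of libvirt's virCPUCompareResult values.
VIR_CPU_COMPARE_INCOMPATIBLE = 0
VIR_CPU_COMPARE_IDENTICAL = 1
VIR_CPU_COMPARE_SUPERSET = 2


def build_cpu_xml(model, vendor, required=(), disabled=()):
    """Build a custom-mode <cpu> definition (hypothetical helper)."""
    cpu = ET.Element("cpu", {"mode": "custom", "match": "exact"})
    model_el = ET.SubElement(cpu, "model", {"fallback": "forbid"})
    model_el.text = model
    ET.SubElement(cpu, "vendor").text = vendor
    for name in required:
        ET.SubElement(cpu, "feature", {"policy": "require", "name": name})
    for name in disabled:
        ET.SubElement(cpu, "feature", {"policy": "disable", "name": name})
    return ET.tostring(cpu, encoding="unicode")


def is_migratable(conn, cpu_xml, use_hypervisor_api=True,
                  arch="x86_64", virttype="kvm"):
    """Return True if the target host can run the given guest CPU.

    conn is a connection to the *target* host. With use_hypervisor_api
    the comparison is made against what the hypervisor can provide,
    otherwise against the host CPU as probed via cpuid.
    """
    if use_hypervisor_api:
        # Assumption: emulator and machine are left as None so that
        # libvirt picks its defaults.
        result = conn.compareHypervisorCPU(
            None, arch, None, virttype, cpu_xml, 0)
    else:
        result = conn.compareCPU(cpu_xml, 0)
    return result in (VIR_CPU_COMPARE_IDENTICAL, VIR_CPU_COMPARE_SUPERSET)
```

With a real connection this might be used as `is_migratable(libvirt.open("qemu+ssh://target/system"), cpu_xml)`; the URI here is only a placeholder.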
Replying to Guoyi Tu's message of 06/04/2021 13:00:

There are already patches up for review to move to the newer CPU APIs: https://review.opendev.org/c/openstack/nova/+/762330 uses baseline_hypervisor_cpu and compare_hypervisor_cpu instead of the old functions. This work will likely be resumed now that we are past feature freeze and the release candidates are out, but we tend not to merge any large change until the release is done. The patch is not particularly big, but changing how we detect CPU features is not something that is great to merge during the RC stabilisation period.

While this should technically resolve https://bugs.launchpad.net/nova/+bug/1903822, it is not really a bug; it is paying down technical debt, so I'm not sure this is something we should backport. With that said, if you are interested in this, you should review that patch.
My test environment is as follows:
host cpu: Cascadelake
libvirt-6.9
qemu-5.0
host-model cpu:
<mode name='host-model' supported='yes'>
  <model fallback='forbid'>Cascadelake-Server</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='umip'/>
  <feature policy='require' name='pku'/>
  <feature policy='require' name='md-clear'/>
  <feature policy='require' name='stibp'/>
  <feature policy='require' name='arch-capabilities'/>
  <feature policy='require' name='xsaves'/>
  <feature policy='require' name='invtsc'/>
  <feature policy='require' name='rdctl-no'/>
  <feature policy='require' name='ibrs-all'/>
  <feature policy='require' name='skip-l1dfl-vmentry'/>
  <feature policy='require' name='mds-no'/>
  <feature policy='require' name='pschange-mc-no'/>
  <feature policy='disable' name='hle'/>
  <feature policy='disable' name='rtm'/>
</mode>
The hypervisor, umip, and pschange-mc-no features block the compatibility check.
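To illustrate how the blocking features could be picked out, here is a rough sketch that lists the features a CPU definition requires but that are absent from a given host feature set. The function names and ASSUMED_HOST_FEATURES are hypothetical; in particular, the host feature set is constructed by hand to match the observation above and is not real cpuid output.

```python
import xml.etree.ElementTree as ET

# The host-model CPU definition reported in the test environment.
HOST_MODEL_XML = """
<mode name='host-model' supported='yes'>
  <model fallback='forbid'>Cascadelake-Server</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='umip'/>
  <feature policy='require' name='pku'/>
  <feature policy='require' name='md-clear'/>
  <feature policy='require' name='stibp'/>
  <feature policy='require' name='arch-capabilities'/>
  <feature policy='require' name='xsaves'/>
  <feature policy='require' name='invtsc'/>
  <feature policy='require' name='rdctl-no'/>
  <feature policy='require' name='ibrs-all'/>
  <feature policy='require' name='skip-l1dfl-vmentry'/>
  <feature policy='require' name='mds-no'/>
  <feature policy='require' name='pschange-mc-no'/>
  <feature policy='disable' name='hle'/>
  <feature policy='disable' name='rtm'/>
</mode>
"""


def required_features(cpu_xml):
    """Return the names of features with policy='require', in XML order."""
    root = ET.fromstring(cpu_xml.strip())
    return [f.get("name") for f in root.iter("feature")
            if f.get("policy") == "require"]


def blocking_features(cpu_xml, host_features):
    """Return required features that are missing from host_features."""
    return [name for name in required_features(cpu_xml)
            if name not in host_features]


# Assumption for illustration only: the host reports every required
# feature except the three named in the message above.
ASSUMED_HOST_FEATURES = set(required_features(HOST_MODEL_XML)) - {
    "hypervisor", "umip", "pschange-mc-no"}

print(blocking_features(HOST_MODEL_XML, ASSUMED_HOST_FEATURES))
```

Under that assumption, the printed list contains hypervisor, umip, and pschange-mc-no, which are the features a cpuid-based comparison would reject.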
participants (2)
- Guoyi Tu
- Sean Mooney