[dev][nova] Problem about vm migration compatibility check

Sean Mooney smooney at redhat.com
Tue Apr 6 23:02:58 UTC 2021



On 06/04/2021 13:00, Guoyi Tu wrote:
> Hi there,
>
> In my test environment, I created a VM with the CPU mode set to
> host-model. When I migrate the VM to another host with the same CPU,
> the migration compatibility check fails, complaining that the CPU
> definition of the domain is incompatible with the target host CPU.
>
> As we know, when a domain configured this way starts, the host-model
> CPU definition is automatically converted to a custom CPU model plus
> some additional features that KVM supports. These additional features
> may include features that the host doesn't support.
>
> In the code, the compatibility of the target host is checked by
> calling compareCPU() (libvirt API). compareCPU() only recognizes the
> features probed by the cpuid instruction on the host, so it may not
> recognize every feature in the CPU definition of the running domain's
> XML (virsh dumpxml domainname). The compatibility check therefore
> fails when KVM supports one or more features that the cpuid
> instruction reports as disabled.
>
> I think we should call compareHypervisorCPU() or something like that 
> (supported by libvirt since v4.4.0) instead of compareCPU() to check 
> the migration compatibility.
There are already patches up for review that move to the newer CPU APIs:
https://review.opendev.org/c/openstack/nova/+/762330
That change uses baseline_hypervisor_cpu and compare_hypervisor_cpu
instead of the old functions.
This work will likely be resumed now that we are past feature freeze
and the release candidates are out, but we tend not to merge any large
change until the release is done.
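
As a rough illustration of the difference between the old and new comparison calls (this is not the code from the patch above), here is a minimal sketch. It assumes `conn` is a libvirt-python `virConnect` object. The helper name `cpu_is_compatible` and its parameters are hypothetical, and the constant is defined locally so the sketch does not need libvirt installed.

```python
# Sketch only: `conn` is assumed to behave like libvirt.virConnect
# from libvirt-python. `cpu_is_compatible` is a hypothetical helper,
# not a nova function.

# Mirrors libvirt.VIR_CPU_COMPARE_INCOMPATIBLE. IDENTICAL is 1 and
# SUPERSET is 2, so anything above 0 means the host can run the CPU.
CPU_COMPARE_INCOMPATIBLE = 0


def cpu_is_compatible(conn, cpu_xml, emulator=None, arch=None,
                      machine=None, virttype="kvm",
                      use_hypervisor_api=True):
    """Return True if the host can provide the CPU described by cpu_xml."""
    if use_hypervisor_api:
        # Newer call (libvirt >= 4.4.0): compares against the CPU that
        # the hypervisor on this host is able to provide to a guest.
        result = conn.compareHypervisorCPU(
            emulator, arch, machine, virttype, cpu_xml, 0)
    else:
        # Older call: compares against the host CPU as libvirt probes it.
        result = conn.compareCPU(cpu_xml, 0)
    return result > CPU_COMPARE_INCOMPATIBLE
```

With flags set to 0, both calls return a comparison result rather than raising on an incompatible CPU, which is why the sketch checks the return value.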

https://review.opendev.org/c/openstack/nova/+/762330 is not particularly
big, but changing how we detect CPU features is not something that is
great to merge during the RC stabilisation period.
While this should technically resolve
https://bugs.launchpad.net/nova/+bug/1903822, it's not really a bug;
it's paying down technical debt, so I'm not sure this is something we
should backport.
With that said, if you are interested in this, you should review that patch.
>
>
> My test environment is as follows:
> host cpu: Cascadelake
> libvirt-6.9
> qemu-5.0
>
> host-model cpu:
>    <mode name='host-model' supported='yes'>
>       <model fallback='forbid'>Cascadelake-Server</model>
>       <vendor>Intel</vendor>
>       <feature policy='require' name='ss'/>
>       <feature policy='require' name='hypervisor'/>
>       <feature policy='require' name='tsc_adjust'/>
>       <feature policy='require' name='umip'/>
>       <feature policy='require' name='pku'/>
>       <feature policy='require' name='md-clear'/>
>       <feature policy='require' name='stibp'/>
>       <feature policy='require' name='arch-capabilities'/>
>       <feature policy='require' name='xsaves'/>
>       <feature policy='require' name='invtsc'/>
>       <feature policy='require' name='rdctl-no'/>
>       <feature policy='require' name='ibrs-all'/>
>       <feature policy='require' name='skip-l1dfl-vmentry'/>
>       <feature policy='require' name='mds-no'/>
>       <feature policy='require' name='pschange-mc-no'/>
>       <feature policy='disable' name='hle'/>
>       <feature policy='disable' name='rtm'/>
>     </mode>
>
>
> The hypervisor, umip, and pschange-mc-no features block the
> compatibility check.
>
>
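To see which required features could block such a check, the host-model definition quoted above can be split by policy and compared against a set of features the host reports. This is only a sketch: the helper names `features_by_policy` and `blocking_features` are hypothetical, and the host feature set has to be supplied by the caller, since the message does not include it.

```python
import xml.etree.ElementTree as ET

# The host-model CPU definition quoted in the message above.
HOST_MODEL_XML = """
<mode name='host-model' supported='yes'>
  <model fallback='forbid'>Cascadelake-Server</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='umip'/>
  <feature policy='require' name='pku'/>
  <feature policy='require' name='md-clear'/>
  <feature policy='require' name='stibp'/>
  <feature policy='require' name='arch-capabilities'/>
  <feature policy='require' name='xsaves'/>
  <feature policy='require' name='invtsc'/>
  <feature policy='require' name='rdctl-no'/>
  <feature policy='require' name='ibrs-all'/>
  <feature policy='require' name='skip-l1dfl-vmentry'/>
  <feature policy='require' name='mds-no'/>
  <feature policy='require' name='pschange-mc-no'/>
  <feature policy='disable' name='hle'/>
  <feature policy='disable' name='rtm'/>
</mode>
"""


def features_by_policy(mode_xml):
    """Group feature names by their policy attribute, in document order."""
    root = ET.fromstring(mode_xml.strip())
    grouped = {}
    for feature in root.findall("feature"):
        grouped.setdefault(feature.get("policy"), []).append(
            feature.get("name"))
    return grouped


def blocking_features(mode_xml, host_features):
    """Return required features that are missing from host_features.

    host_features is whatever set of feature names the comparison sees
    on the target host; it is an input here, not something probed.
    """
    required = features_by_policy(mode_xml).get("require", [])
    return [name for name in required if name not in host_features]
```

Passing a host feature set that lacks hypervisor, umip, and pschange-mc-no makes `blocking_features` return exactly those three names, which matches the failure described above.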



