[nova][ops] Live migration and CPU features

Fabian Zimmermann dev.faz at gmail.com
Tue Aug 18 15:06:47 UTC 2020


We are using the "custom" approach, but it does not protect you from all issues.

We had problems with a new CPU generation that was not (yet) detected correctly
by a particular libvirt version. libvirt fell back to the "desktop" CPU model of
this newer generation, but did not support/detect some features, which
blocked live migration.
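
The constraint behind both cases is that every CPU feature a running guest
was started with must also be exposed on the destination host. Here is a
rough Python sketch of that check, and of picking a common baseline for a
"custom" model. The host names and flag sets are made up for illustration,
and this is not how nova or libvirt implement the comparison:

```python
def blocking_features(guest_features, dest_features):
    """Flags the guest was started with that the destination does not expose.

    A non-empty result means live migration to that destination would be refused.
    """
    return set(guest_features) - set(dest_features)


def common_baseline(host_features):
    """Flags exposed by every host: a safe feature set for a "custom" model."""
    feature_sets = [set(flags) for flags in host_features.values()]
    if not feature_sets:
        return set()
    return set.intersection(*feature_sets)


# Hypothetical inventory: the rebooted node exposes one extra flag
# (for example after a kernel/microcode update).
hosts = {
    "compute-rebooted": {"sse4_2", "avx2", "md-clear"},
    "compute-old": {"sse4_2", "avx2"},
}

# With host-model, a guest (re)started on the rebooted node gets all its flags.
guest = hosts["compute-rebooted"]
missing = blocking_features(guest, hosts["compute-old"])

# A guest restricted to the common baseline can move to either node.
baseline = common_baseline(hosts)
```

Note that the baseline is only as good as what libvirt reports per host, so
a detection problem like the one above can still shrink it or make a host
look incompatible.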


On Tue, 18 Aug 2020 at 16:54, Belmiro Moreira
<moreira.belmiro.email.lists at gmail.com> wrote:
> Hi,
> in our infrastructure we always have compute nodes that need a hardware intervention and, as a consequence, they are rebooted, bringing a new kernel, kvm, ...
> In order to have a good compromise between performance and flexibility (live migration) we have been using "host-model" for the "cpu_mode" configuration of our service VMs. We didn't expect to have CPU compatibility issues because we have the same hardware type per cell.
> The problem is that when a compute node is rebooted, the instance domain is recreated with the new CPU features that were introduced by the reboot (using CentOS).
> If there are new CPU features exposed, this basically blocks live migration to all the non-rebooted compute nodes (those CPU features are not exposed there yet). The nova-scheduler doesn't know about them when scheduling the live migration destination.
> I wonder how other operators are solving this issue.
> I don't like stopping OS upgrades.
> What I'm considering is to define a "custom" cpu_mode for each hardware type.
> I would appreciate your comments and learn how you are solving this problem.
> Belmiro
