[nova][ops] Live migration and CPU features

Sean Mooney smooney at redhat.com
Tue Aug 18 15:11:45 UTC 2020


On Tue, 2020-08-18 at 17:01 +0200, Luis Ramirez wrote:
> Hi,
> 
> Try to choose a custom cpu_model that fits into your infra. This should be
> the best approach to avoid this kind of problem. If the performance is not
> an issue for the tenants, KVM64 should be a good choice.
you should never use kvm64 in production.
it is not maintained for security vulnerabilities, e.g. it is never updated with
any of the feature flags needed to mitigate issues like Spectre etc.

it's perfect for CI and test environments where you don't control the underlying cloud
and are using nested virt. it's also semi-reasonable for nested VMs,
but it's not a good choice for the host.

you should either use host-passthrough and segregate your hosts using aggregates or other
means to ensure live migration capability, or use a custom model. host-model is a good default
provided you upgrade all hosts at the same time and you are ok with the feature set changing.
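for reference, the [libvirt] section of nova.conf would look something like this; the
model name below is only an illustration, pick one that matches the oldest cpu
generation in your fleet:

    [libvirt]
    # option 1: expose the host cpu directly; live migration then needs
    # matching cpus on source and destination
    cpu_mode = host-passthrough

    # option 2: pin the guest cpu to a named model so migration works
    # across any host whose cpu is a superset of that model
    #cpu_mode = custom
    #cpu_models = Haswell-noTSX-IBRS
    #cpu_model_extra_flags = pcid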

host-model has a one-way migration problem where it is possible to migrate from an old host to a new one but
not from new to old if the vm is hard rebooted in between. so when using host-model we still
recommend segregating hosts by cpu generation to avoid that.
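if you do go the segregation route, a rough sketch with host aggregates would be
something like the following; the aggregate name, property key and flavor here are
hypothetical, and AggregateInstanceExtraSpecsFilter has to be enabled in the
scheduler's enabled_filters for the flavor extra spec to be honoured:

    # group the hosts of one cpu generation into an aggregate
    openstack aggregate create --property cpu_gen=haswell haswell-hosts
    openstack aggregate add host haswell-hosts compute-01

    # tie flavors to that aggregate so the scheduler only picks
    # matching hosts for boot and live migration
    openstack flavor set \
        --property aggregate_instance_extra_specs:cpu_gen=haswell m1.medium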
> 
> Br,
> Luis Rmz <https://www.linkedin.com/in/luisframirez/>
> Blockchain, DevOps & Open Source Cloud Solutions Architect
> ----------------------------------------
> Founder & CEO
> OpenCloud.es <http://www.opencloud.es/>
> luis.ramirez at opencloud.es
> Skype ID: d.overload
> Hangouts: luis.ramirez at opencloud.es
> [image: ] +34 911 950 123 / [image: ]+39 392 1289553 / [image: ]+49 152
> 26917722 / Česká republika: +420 774 274 882
> -----------------------------------------------------
> 
> 
> On Tue, Aug 18, 2020 at 16:55, Belmiro Moreira (<
> moreira.belmiro.email.lists at gmail.com>) wrote:
> 
> > Hi,
> > in our infrastructure there are always compute nodes that need a hardware
> > intervention and, as a consequence, are rebooted, bringing a new kernel,
> > kvm, ...
> > 
> > In order to have a good compromise between performance and flexibility
> > (live migration) we have been using "host-model" for the "cpu_mode"
> > configuration of our service VMs. We didn't expect to have CPU
> > compatibility issues because we have the same hardware type per cell.
> > 
> > The problem is that when a compute node is rebooted the instance domain is
> > recreated with the new cpu features that were introduced because of the
> > reboot (we are using CentOS).
> > 
> > If there are new CPU features exposed, this basically blocks live
> > migration to all the non-rebooted compute nodes (those cpu features are not
> > exposed, yet). The nova-scheduler doesn't know about them when scheduling
> > the live migration destination.
> > 
> > I wonder how other operators are solving this issue.
> > I don't like stopping OS upgrades.
> > What I'm considering is to define a "custom" cpu_mode for each hardware
> > type.
> > 
> > I would appreciate your comments and learn how you are solving this
> > problem.
> > 
> > Belmiro
> > 
> > 



