[ops] QOS on flavor breaking live migration from CentOS 7 to 8

Sean Mooney smooney at redhat.com
Tue Jan 3 19:20:50 UTC 2023


Hi, yes, this is a known issue.

So the simple answer is to resize all affected VMs instead of live migrating them.
The longer answer is that we have been discussing this internally at Red Hat on and off for
some time now.
https://bugs.launchpad.net/nova/+bug/1960840 is one example where this happens.
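
As a rough illustration of the resize approach: the replacement flavor would carry the same extra specs as the old one, minus the bandwidth quota keys. A minimal sketch in Python, assuming all problematic keys start with quota:vif_ (the helper name and the sample values below are made up for illustration and are not part of nova):

```python
# Hypothetical helper, not a nova API: split flavor extra specs into the
# bandwidth QoS keys and everything else.
QOS_PREFIX = "quota:vif_"  # assumption: all affected keys share this prefix


def strip_vif_quota(extra_specs):
    """Return (kept, dropped) copies of the given extra specs dict."""
    kept = {k: v for k, v in extra_specs.items()
            if not k.startswith(QOS_PREFIX)}
    dropped = {k: v for k, v in extra_specs.items()
               if k.startswith(QOS_PREFIX)}
    return kept, dropped


# Sample values are illustrative only.
old_specs = {
    "quota:vif_inbound_burst": "60000000",
    "quota:vif_outbound_burst": "60000000",
    "hw:cpu_policy": "dedicated",
}
kept, dropped = strip_vif_quota(old_specs)
print("specs for the new flavor:", kept)
print("specs to leave out:", sorted(dropped))
```

The kept dict is what you would set on the new flavor before resizing the affected instances to it.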

There is another case, for the CPU-based quotas, that happens when going from RHEL/CentOS 8 to 9.
Basically, in the 8->9 change the cgroups implementation changes from v1 to v2:
https://bugzilla.redhat.com/show_bug.cgi?id=2035518

When addressing that, we did not have a good universal solution, other than resize, for instances that hardcoded a value that
was incompatible with the cgroups v2 API in the kernel.
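
If you are unsure which cgroups version a given compute host is running, a quick check is possible from Python. This is only a sketch: it assumes the usual /sys/fs/cgroup mount point and relies on the cgroup.controllers file being present at the root of a v2 (unified) hierarchy. The function name is made up for illustration.

```python
import os


def cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 if root looks like a cgroups v2 unified mount, else 1."""
    # cgroup.controllers only exists at the root of a v2 hierarchy.
    # A hybrid setup (v2 mounted under a subdirectory) is reported as 1.
    controllers = os.path.join(root, "cgroup.controllers")
    return 2 if os.path.isfile(controllers) else 1


if __name__ == "__main__":
    print("cgroups v%d" % cgroup_version())
```

Running it on the source and destination hosts shows whether a migration would cross the v1 to v2 boundary.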

In https://review.opendev.org/c/openstack/nova/+/824048/ we stopped automatically adding the
cpu_shares cgroup option, to enable booting VMs with more than 8 CPUs.

We did not come up with any option other than resize for the other quotas that were in a similar situation.
The one option that we considered possible was to extend nova-manage to allow the embedded flavor to be updated.
This would be similar to what we did to enable image properties to be modified for changing machine types.

https://docs.openstack.org/nova/latest/cli/nova-manage.html#image-property-commands

We discussed at the time that, while we did not want to allow flavor extra specs to be modified, we might reconsider that
if the quota issue forced our hand or we had a similar need due to forces beyond our control, i.e. we needed to provide a way beyond
resize, e.g. due to operating system changes. What makes image properties and flavor extra specs different is that image properties can
only be updated by rebuild, which is a destructive operation. Extra specs are updated by resize, which is not a destructive operation.
That is one of the reasons we gave special consideration to image properties and did not do the same for extra specs.

If we allowed the same for flavor extra specs, you would still have to stop the instance, make the change and then migrate the instance.
Resize automates that, so it is generally a better fit. We were also concerned that adding it to nova-manage would result in it being abused
to modify instances in ways that were either invalid for the host (changing the NUMA topology, adding traits/resource requests not tracked in placement)
or would otherwise break the instance in weird ways. That could happen via image properties too, but it's less likely.



On Tue, 2023-01-03 at 17:25 +0100, Jahson Babel wrote:
> Hello,
> 
> I'm trying to live migrate some VMs from CentOS 7 to Rocky 8.
> Everything runs smoothly when there are no extra specs on flavors, but
> things get more complicated when those are set, especially when
> using quota:vif_burst for QOS.
> I know that we aren't supposed to use this for QOS now but it's an old 
> cluster and it was done that way at the time. So VMs kinda have all 
> those specs tied to them.
> 
> When live migrating a VM, this shows up in nova's logs:
> driver.py _live_migration_operation nova.virt.libvirt.driver  Live 
> Migration failure: internal error: Child process (tc class add dev 
> tapxxxxxxxx-xx parent 1: classid 1:1 htb rate 250000kbps ceil 
> 2000000kbps burst 60000000kb quantum 21333) unexpected exit status 1: 
> Illegal "burst"
> This bug covers the problem: https://bugs.launchpad.net/nova/+bug/1960840
> So it seems to be normal behavior. Plus, I forgot to mention that I'm
> on the OpenStack Train version and the file mentioned in the Launchpad bug is
> not present in this version.
> By using Rocky 8 I have to use an updated libvirt that won't accept the 
> burst parameter we used to set. All available versions of libvirt on 
> Rocky 8 have changed behavior concerning the burst parameter.
> 
> I've done some testing to make things work, including removing the
> extra_specs on flavors and in the DB, removing it through libvirt and
> trying to modify the tc rules used by a VM, but none of it worked.
> I have not yet tried to patch Nova or libvirt, but I don't really know
> where to look.
> The only thing that did work was to resize the VM to an identical flavor
> without the extra_specs. But this induces a complete reboot of the VM. I
> would like, if possible, to be able to live migrate the VMs, which is
> much easier.
> 
> Is it possible to remove the extra_specs on the VMs and then live 
> migrate ? Or should I just plan to resize/reboot all VMs without those 
> extra_specs ?
> Any advice will be appreciated.
> 
> Thank you for any help,
> Best regards.
> 
> Jahson



