[ops] QOS on flavor breaking live migration from CentOS 7 to 8

Jahson Babel jahson.babel at cc.in2p3.fr
Wed Jan 4 09:17:06 UTC 2023


Hello,

Alright, thank you for all those pieces of information, past and future
with RHEL 9, and for the history behind this behavior. It's really interesting.
I was hoping some clever manipulation would do the trick. But I'm a
mere fool and at least I know what I have to do now.

Thanks again for your detailed answer.
Have a nice day.

Jahson

On 03/01/2023 20:20, Sean Mooney wrote:
> Hi, yes, this is a known issue.
>
> So the simple answer is to resize all affected VMs instead of live migrating them.
> The longer answer is that we have been discussing this internally at Red Hat on and off for
> some time now.
> https://bugs.launchpad.net/nova/+bug/1960840 is one example where this happens.
>
> There is another case, for the CPU-based quotas, that happens when going from RHEL/CentOS 8 to 9.
> Basically, in the 8->9 change the cgroups implementation changes from v1 to v2:
> https://bugzilla.redhat.com/show_bug.cgi?id=2035518
>
> When addressing that, we did not have a good universal solution, other than resize, for instances that hardcoded a value that
> was incompatible with the cgroups v2 API in the kernel.
>
> In https://review.opendev.org/c/openstack/nova/+/824048/ we removed automatically adding the
> cpu_shares cgroup option to enable booting VMs with more than 8 CPUs.
>
> We did not come up with any option other than resize for the other quotas that were in a similar situation.
> The one option that we considered possible was to extend nova-manage to allow the embedded flavor to be updated.
> This would be similar to what we did to enable the image property to be modified for changing machine types.
>
> https://docs.openstack.org/nova/latest/cli/nova-manage.html#image-property-commands
>
> We discussed at the time that, while we did not want to allow flavor extra specs to be modified, we might reconsider that
> if the quota issue forced our hand or we had a similar need due to forces beyond our control, i.e. we needed to provide a way beyond
> resize, e.g. due to operating system changes. What makes image properties and flavor extra specs different is that image properties can
> only be updated by rebuild, which is a destructive operation. Extra specs are updated by resize, which is not a destructive operation.
> That is one of the reasons we gave special consideration to image properties and did not do the same for extra specs.
>
> If we allowed the same for flavor extra specs, you would still have to stop the instance, make the change and then migrate the instance.
> Resize automates that, so it is generally a better fit. We were also concerned that adding it to nova-manage would result in it being abused
> to modify instances in ways that were either invalid for the host (changing the NUMA topology, adding traits/resource requests not tracked in placement)
> or otherwise break the instance in weird ways. That could happen via image properties too, but it's less likely.
>
>
>
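
The resize route described above (moving each affected VM to an otherwise identical flavor without the QoS extra specs) can be planned with a short script before touching the cloud. What follows is only a minimal sketch in plain Python: the function names, the "-noqos" suffix and the dictionary shapes are hypothetical, it assumes the offending keys all start with "quota:vif_", and it makes no Nova API calls. Creating the replacement flavors and issuing the resizes is left to whatever tooling is already in use.

```python
# Hypothetical planning helper; nothing here is part of Nova or its clients.
QOS_PREFIX = "quota:vif_"  # assumption: all offending keys share this prefix


def strip_vif_quota(extra_specs):
    """Return a copy of the extra specs without the quota:vif_* keys."""
    return {k: v for k, v in extra_specs.items() if not k.startswith(QOS_PREFIX)}


def plan_resizes(flavors, servers):
    """Work out replacement flavors and which servers need a resize.

    flavors: {flavor_name: extra_specs_dict}
    servers: {server_name: flavor_name}
    Returns (replacements, plan) where
      replacements maps old flavor name -> (new flavor name, cleaned specs)
      plan maps server name -> new flavor name to resize to.
    """
    replacements = {}
    for name, specs in flavors.items():
        cleaned = strip_vif_quota(specs)
        if cleaned != specs:  # only flavors that actually carry QoS keys
            replacements[name] = (name + "-noqos", cleaned)
    plan = {
        server: replacements[flavor][0]
        for server, flavor in servers.items()
        if flavor in replacements
    }
    return replacements, plan
```

Servers whose flavor carries no quota:vif_* keys do not appear in the plan, so they can still be live migrated as usual.
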
> On Tue, 2023-01-03 at 17:25 +0100, Jahson Babel wrote:
>> Hello,
>>
>> I'm trying to live migrate some VMs from CentOS 7 to Rocky 8.
>> Everything runs smoothly when there are no extra specs on flavors, but
>> things get more complicated when those are set. Especially when
>> using quota:vif_burst for QOS.
>> I know that we aren't supposed to use this for QOS now but it's an old
>> cluster and it was done that way at the time. So VMs kinda have all
>> those specs tied to them.
>>
>> When live migrating a VM, this shows up in Nova's logs:
>> driver.py _live_migration_operation nova.virt.libvirt.driver  Live
>> Migration failure: internal error: Child process (tc class add dev
>> tapxxxxxxxx-xx parent 1: classid 1:1 htb rate 250000kbps ceil
>> 2000000kbps burst 60000000kb quantum 21333) unexpected exit status 1:
>> Illegal "burst"
>> This bug covers the problem: https://bugs.launchpad.net/nova/+bug/1960840
>> So it seems to be normal behavior. Plus I forgot to mention that I'm
>> on the OpenStack Train version and the file mentioned in the Launchpad bug is
>> not present in this version.
>> By using Rocky 8 I have to use an updated libvirt that won't accept the
>> burst parameter we used to set. All available versions of libvirt on
>> Rocky 8 have changed behavior concerning the burst parameter.
>>
>> I've done some testing to make things work, including removing the
>> extra_specs on flavors and in the DB, removing them through libvirt and
>> trying to modify the tc rules used by a VM, but it didn't work.
>> I have not yet tried to patch Nova or libvirt, and I don't really know
>> where to look.
>> The only thing that did work was to resize the VM to an identical flavor
>> without the extra_specs. But this induces a complete reboot of the VM. I
>> would like, if possible, to be able to live migrate the VMs, which is
>> much easier.
>>
>> Is it possible to remove the extra_specs on the VMs and then live
>> migrate? Or should I just plan to resize/reboot all VMs without those
>> extra_specs?
>> Any advice will be appreciated.
>>
>> Thank you for any help,
>> Best regards.
>>
>> Jahson
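
To find which interfaces hit this failure, the tc command embedded in the log message quoted above can be picked apart with a regular expression. This is a minimal sketch, assuming the message has exactly the shape shown in this thread once the wrapped lines are joined. The function name and the returned keys are hypothetical, and the key names only mirror the unit suffixes as they appear in the log.

```python
import re

# Matches the "tc class add" command as it appears in the quoted log message.
TC_PATTERN = re.compile(
    r"tc class add dev (?P<dev>\S+) .*?"
    r"rate (?P<rate>\d+)kbps ceil (?P<ceil>\d+)kbps burst (?P<burst>\d+)kb"
)


def parse_tc_failure(message):
    """Extract device and htb parameters from a failed-migration log message.

    Returns a dict, or None when the message does not contain the tc command.
    """
    match = TC_PATTERN.search(message)
    if match is None:
        return None
    return {
        "dev": match.group("dev"),
        "rate_kbps": int(match.group("rate")),
        "ceil_kbps": int(match.group("ceil")),
        "burst_kb": int(match.group("burst")),
    }


# The message from this thread, with the mail line wrapping removed.
SAMPLE = (
    "Live Migration failure: internal error: Child process (tc class add dev "
    "tapxxxxxxxx-xx parent 1: classid 1:1 htb rate 250000kbps ceil "
    '2000000kbps burst 60000000kb quantum 21333) unexpected exit status 1: '
    'Illegal "burst"'
)
```

Calling parse_tc_failure(SAMPLE) yields the tap device name along with the rate, ceil and burst values, which can then be matched back to the port and the VM.
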