[kolla] Updating libvirt container images without VM downtime
Hi,

I'm looking for validation regarding the way Kolla and containers work in regard to upgrading the libvirt containers. Essentially, when you upgrade the libvirt container to a new container image, the container needs to be restarted, thus creating downtime for the VMs. There is no way to avoid this downtime, unless you migrate the VMs to another node and then move them back once the container has restarted, right?

--
Jean-Philippe Méthot
Senior Openstack system administrator
Administrateur système Openstack sénior
PlanetHoster inc.
In reply to the message from J-P Methot <jp.methot@planethoster.info>, sent 04 January 2022 19:17 to openstack-discuss <openstack-discuss@lists.openstack.org>:

If working properly, restarting the libvirtd container does not affect running VMs. In a standard setup, the only containers on the hypervisors whose restart actually affects the VMs are ovs-vswitchd, ovn-controller and ovsdb; restarting those causes a small blip in the VM networks, as we have noticed.

Danny Webb
Senior Linux Systems Administrator
The Hut Group <http://www.thehutgroup.com/>
Email: Danny.Webb@thehutgroup.com

For the purposes of this email, the "company" means The Hut Group Limited, a company registered in England and Wales (company number 6539496) whose registered office is at Fifth Floor, Voyager House, Chicago Avenue, Manchester Airport, M90 3DQ and/or any of its respective subsidiaries.

Confidentiality Notice: This e-mail is confidential and intended for the use of the named recipient only. If you are not the intended recipient please notify us by telephone immediately on +44(0)1606 811888 or return it to us by e-mail. Please then delete it from your system and note that any use, dissemination, forwarding, printing or copying is strictly prohibited. Any views or opinions are solely those of the author and do not necessarily represent those of the company.

Encryptions and Viruses: Please note that this e-mail and any attachments have not been encrypted. They may therefore be liable to be compromised. Please also note that it is your responsibility to scan this e-mail and any attachments for viruses. We do not, to the extent permitted by law, accept any liability (whether in contract, negligence or otherwise) for any virus infection and/or external compromise of security and/or confidentiality in relation to transmissions sent by e-mail.

Monitoring: Activity and use of the company's systems is monitored to secure its effective use and operation and for other lawful business purposes. Communications using these systems will also be monitored and may be recorded to secure effective use and operation and for other lawful business purposes.
In reply to Danny Webb's message of 1/5/22 4:30 AM:

This is interesting, because it's the exact opposite of what I'm seeing in my test infrastructure.

If I run sudo docker restart 5b737dc80fc5 (the libvirt container), I lose SSH access and web console access, and the dashboard's log window becomes empty. When the libvirt container comes back up, if I go inside the container and run virsh list, the returned list is empty. As far as I can tell, the VM is effectively shut down. The OpenStack dashboard still reports it as up, but any attempted operation on the VM forces the dashboard to update and show the VM as shut off.

As far as I can tell, restarting the Docker container for libvirt did kill off my VM here. From what you tell me, this is not the expected behaviour. So, I must ask, why is it acting differently in my environment?

Additionally, someone else said that the VMs run on the host as /usr/libexec/qemu-kvm processes. On my compute host, qemu-kvm is not present in /usr/libexec. I understand that this could be due to an OS difference, but I thought it would be important information to add.

This is Kolla 12.0 on Ubuntu 20.04.3 with the OpenStack Wallaby containers installed.
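One way to confirm from the host whether the guests survive a restart, independently of virsh list and the dashboard, is to look for the QEMU processes themselves. The sketch below is a minimal illustration, not part of Kolla: it scans /proc for processes whose executable is named qemu-kvm or qemu-system-<arch>. That name list is an assumption (RHEL-family hosts typically ship /usr/libexec/qemu-kvm, while Debian and Ubuntu hosts typically ship qemu-system-x86_64 under /usr/bin), and the function names are hypothetical.

```python
import os


def is_qemu_command(argv0):
    """Heuristic: does this executable path look like a QEMU guest binary?

    The accepted names are an assumption, not an exhaustive list.
    """
    name = os.path.basename(argv0)
    return name == "qemu-kvm" or name.startswith("qemu-system-")


def list_qemu_pids(proc_root="/proc"):
    """Return {pid: executable} for every QEMU process found under proc_root."""
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                argv = fh.read().split(b"\0")
        except OSError:
            continue  # process exited meanwhile, or is not readable
        argv0 = argv[0].decode(errors="replace") if argv and argv[0] else ""
        if argv0 and is_qemu_command(argv0):
            found[int(entry)] = argv0
    return found


if __name__ == "__main__":
    # Run on the compute host before and after restarting the libvirt
    # container; the same PIDs should be listed both times if the guests
    # were not killed.
    for pid, exe in sorted(list_qemu_pids().items()):
        print(pid, exe)
```

Running it on the compute host before and after the container restart and comparing the two PID lists shows directly whether the guest processes were killed.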
In reply to J-P Methot's message of Fri, 7 Jan 2022 at 22:27:

Hello J-P,

I believe you must be hitting this critical bug, which was fixed in kolla-ansible 12.2.0: https://bugs.launchpad.net/kolla-ansible/+bug/1941706

I would recommend keeping on top of kolla-ansible updates, at least using tagged releases, which are also published to PyPI. By staying on the initial Wallaby release, you are missing six months of bug fixes.

Best wishes,
Pierre Riteau (priteau)
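To check whether an installed kolla-ansible predates that fix, a small comparison along the following lines could be used. This is only a sketch: the single fact taken from the message above is that 12.2.0 is the release carrying the fix. The helper names are hypothetical, only plain numeric versions in the 12.x (Wallaby) series are handled, and the lookup assumes kolla-ansible was installed from PyPI into the Python environment running the script.

```python
from importlib import metadata

# First release carrying the fix, per the message above.
FIXED_IN = (12, 2, 0)


def parse_version(text):
    """Turn '12.0.0' into (12, 0, 0), padding short versions with zeros.

    Only plain dotted numeric versions are handled; anything else
    (for example a release-candidate suffix) raises ValueError.
    """
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def predates_fix(installed):
    """True if an installed 12.x version is older than 12.2.0."""
    version = parse_version(installed)
    if version[0] != FIXED_IN[0]:
        raise ValueError("this sketch only reasons about the 12.x series")
    return version < FIXED_IN


if __name__ == "__main__":
    try:
        installed = metadata.version("kolla-ansible")
        verdict = "predates the fix" if predates_fix(installed) else "includes the fix"
        print(installed, verdict)
    except metadata.PackageNotFoundError:
        print("kolla-ansible is not installed in this Python environment")
    except ValueError as exc:
        print("cannot decide:", exc)
```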
On Fri, 2022-01-07 at 16:22 -0500, J-P Methot wrote:
> As far as I can tell, restarting the docker container for libvirt did kill off my VM here. From what you tell me, this is not the expected behaviour. So, I must ask, why is it acting differently in my environment?

It really should not.

The container should be running with pid=host, so that the VMs are parented to the host's PID 1 rather than to the container, and so that they outlive the container.

The other thing to be aware of is the cgroup behaviour. Stopping the container, especially if you are managing it with systemd/podman, can kill all processes in the same cgroup if it is configured incorrectly, which can result in the VMs being killed. The qemu processes should not be in the Docker-created cgroups.

libvirt's behaviour also changes depending on whether you have systemd-container and systemd-machined configured and enabled on the host, so you can see the VM shutdown behaviour if you install systemd-container on the host and restart the VMs. This is because libvirt switches from its legacy direct cgroup interface to using systemd to interact with cgroups. libvirt does not have an upgrade mechanism to go from one to the other, so since the existing VMs are not registered in systemd, it shuts them down because it thinks they should not be running.
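The two conditions described above (guest processes parented to the host's PID 1, and not placed in a Docker-created cgroup) can be checked from the contents of /proc/<pid>/stat and /proc/<pid>/cgroup. The following is a rough sketch rather than a definitive test: the helper names are hypothetical, and the cgroup check is a heuristic that assumes Docker-created cgroups show up as a /docker/<id> path component (cgroupfs driver) or as a docker-<id>.scope unit (systemd driver).

```python
def parent_pid(stat_text):
    """Extract the parent PID from the contents of /proc/<pid>/stat.

    The process name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the last ')'.
    """
    fields = stat_text[stat_text.rindex(")") + 2:].split()
    return int(fields[1])  # fields[0] is the state, fields[1] the parent PID


def cgroup_paths(cgroup_text):
    """Return the cgroup path of every line in /proc/<pid>/cgroup.

    Each line has the form 'hierarchy-id:controllers:path'.
    """
    paths = []
    for line in cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3:
            paths.append(parts[2])
    return paths


def in_docker_cgroup(cgroup_text):
    """Heuristic: is any of the process's cgroups one created by Docker?"""
    return any(
        "/docker/" in path or "/docker-" in path
        for path in cgroup_paths(cgroup_text)
    )


def check_guest(pid, proc_root="/proc"):
    """Report (parented_to_pid_1, outside_docker_cgroup) for one process."""
    with open(f"{proc_root}/{pid}/stat") as fh:
        ppid = parent_pid(fh.read())
    with open(f"{proc_root}/{pid}/cgroup") as fh:
        docker = in_docker_cgroup(fh.read())
    return ppid == 1, not docker
```

Calling check_guest with the PID of a running qemu process on the compute host returns two booleans; if either is False, the guest is likely to be affected when the libvirt container is stopped.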
participants (4)
- Danny Webb
- J-P Methot
- Pierre Riteau
- Sean Mooney