Hi,

it might be necessary to blocklist a hypervisor to get rid of stale locks. You and I discussed that in a thread [1] a while ago. ;-)
Or did you already check the locks and it's a different issue?

Regards,
Eugen

[1] https://lists.openstack.org/pipermail/openstack-discuss/2023-February/032274...

Quoting Satish Patel <satish.txt@gmail.com>:

Folks,
We had a Ceph storage failure. It is finally up and running again in HEALTH_OK state, but now I am not able to bring up my OpenStack VMs. I am running the Zed release of OpenStack. I have noticed a very odd error in nova_libvirt and am not sure what the solution is.
root@comp1:~# tail -f /var/log/kolla/libvirt/libvirtd.log
2024-07-08 07:22:07.420+0000: 3160: warning : qemuDomainObjBeginJobInternal:934 : Cannot start job (query, none, none) for domain instance-000014b0; current job is (async nested, none, start) owned by (3159 remoteDispatchDomainCreateWithFlags, 0 <null>, 3159 remoteDispatchDomainCreateWithFlags (flags=0x1)) for (1563s, 0s, 1563s)
2024-07-08 07:22:07.420+0000: 3160: error : qemuDomainObjBeginJobInternal:968 : Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreateWithFlags)
2024-07-08 07:22:37.444+0000: 41620: warning : qemuDomainObjBeginJobInternal:934 : Cannot start job (query, none, none) for domain instance-000015b2; current job is (async nested, none, start) owned by (3162 remoteDispatchDomainCreateWithFlags, 0 <null>, 3162 remoteDispatchDomainCreateWithFlags (flags=0x1)) for (1671s, 0s, 1672s)
2024-07-08 07:22:37.444+0000: 41620: error : qemuDomainObjBeginJobInternal:968 : Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreateWithFlags)
I am seeing the VMs in a paused state. Any idea what I can do to fix it? I have also rebooted the compute nodes in the hope that it would free up the lock or kill a bad process, but no luck.
root@comp1:~# docker exec -it nova_libvirt virsh list
 Id   Name                State
----------------------------------
 1    instance-00001751   paused
 2    instance-000015b2   paused
 3    instance-000014b0   paused
When I try to reboot an instance I see the following error. It looks like qemu is acting up.
root@comp1:~# docker exec -it nova_libvirt virsh reboot instance-00001751
error: Failed to reboot domain 'instance-00001751'
error: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainMigratePerform3Params)
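
To see at a glance which domains are stuck and for how long, the "Cannot start job" warnings in libvirtd.log can be summarized with a few lines of Python. This is only a rough sketch: the regular expression is derived solely from the log lines quoted in this thread, the names STUCK_JOB and stuck_domains are made up for illustration, and other libvirt versions may format the message differently. It does not release any lock; it only reports which API call holds the job on each domain.

```python
import re

# Pattern based only on the warning lines quoted in this thread (an assumption,
# not an official libvirt log format specification).
STUCK_JOB = re.compile(
    r"Cannot start job \([^)]*\) for domain (?P<domain>[^;]+); "
    r".*?owned by \(\d+ (?P<api>\w+),"
    r".*? for \((?P<held>\d+)s,"
)


def stuck_domains(lines):
    """Map each domain name to (API call owning the job, longest hold time in seconds)."""
    result = {}
    for line in lines:
        match = STUCK_JOB.search(line)
        if match is None:
            continue  # not a "Cannot start job" warning
        domain = match["domain"]
        api = match["api"]
        held = int(match["held"])
        # Keep the longest hold time seen for each domain.
        if domain not in result or held > result[domain][1]:
            result[domain] = (api, held)
    return result


if __name__ == "__main__":
    # One of the warning lines from the log above, used as sample input.
    sample = (
        "2024-07-08 07:22:07.420+0000: 3160: warning : "
        "qemuDomainObjBeginJobInternal:934 : Cannot start job "
        "(query, none, none) for domain instance-000014b0; current job is "
        "(async nested, none, start) owned by "
        "(3159 remoteDispatchDomainCreateWithFlags, 0 <null>, "
        "3159 remoteDispatchDomainCreateWithFlags (flags=0x1)) "
        "for (1563s, 0s, 1563s)"
    )
    for domain, (api, held) in stuck_domains([sample]).items():
        print(f"{domain}: job held by {api} for {held}s")
```

On a compute node, the same function could be fed the lines of /var/log/kolla/libvirt/libvirtd.log. Hold times that keep growing across successive warnings would suggest the start job is blocked on something outside libvirt, which fits the stale Ceph lock theory above.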