On 27/11/2025 11:57, Sean Mooney wrote:
On 27/11/2025 02:46, Nguyễn Hữu Khôi wrote:
Hello.
I just selected live migrate from Horizon without a destination. The instance is on shared storage, and I don't use force-complete. It looks like enable_qemu_monitor_announce_self = true causes this problem; it is fine if I change it to false. I used this option on OpenStack Xena because without it I could not ping or access instances after live migration. That cloud uses OVS. I tested with my current cloud, 2025.1, which uses OVN, and it seems we don't need this option anymore. Please correct me if I am wrong.
I should have also said that this option was needed for OVN in the past as well, but as of Caracal or so, OVN and Neutron now support multiple logical switch ports. Before that, OVN also suffered from network downtime in a similar way to OVS, especially on VLAN networks, because it would not set up any egress flows until we activated the port binding in post-live-migrate, instead of setting it up in pre-live-migrate like it should have. The Neutron support landed in Zed: https://review.opendev.org/c/openstack/neutron/+/828455

I don't recall the specific OVN version required for that optimisation to be present; I vaguely have OVN 23? in mind, but that could be way off. OVN also has other issues with how it works today under load: https://bugs.launchpad.net/neutron/+bug/2069718 There is a good YouTube presentation from the summit on how OVN can have connectivity issues in larger deployments, or where the reconciliation time for OVN is long: https://www.youtube.com/watch?v=POjSOxKyrE0
It's a workaround option, so it was never required; it just helped mitigate some downtime that could happen in older releases of OpenStack. It should not be required anymore.
But the delay you are seeing is an unexpected side effect, although it is benign overall.
So what's happening is that when we invoke QEMU to send those RARP packets, we must be waiting for that to complete before completing the migration.
I think I see the problem:
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/guest.py#L66...
We are directly calling the QEMU monitor via the libvirt_qemu module, which is a C module. We are expecting that call to yield due to eventlet, but it's possible that it is blocking.
On one hand, we probably should be kicking that off in a background thread and just moving on with the rest of post-live-migration; on the other hand, this should not really be needed after Antelope or Bobcat-ish, so I'm not sure how useful it is to make that change now.
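Something like this is what I have in mind (just a rough sketch, untested; announce_self_async is a made-up name and this is not the actual Nova code):

    import json

    import eventlet
    from eventlet import tpool
    import libvirt_qemu

    def announce_self_async(dom):
        # QMP command that makes QEMU emit the RARP/announce packets.
        cmd = json.dumps({"execute": "announce-self"})
        # qemuMonitorCommand is implemented in C and does not yield to
        # eventlet; tpool.execute runs it in a real OS thread, and
        # spawn_n lets post-live-migration continue without waiting.
        eventlet.spawn_n(
            tpool.execute, libvirt_qemu.qemuMonitorCommand, dom, cmd,
            libvirt_qemu.VIR_DOMAIN_QEMU_MONITOR_COMMAND_DEFAULT)

where dom is the libvirt virDomain handle for the guest.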
Anyway, yes, turning off the enable_qemu_monitor_announce_self workaround in 2025.1 should be safe.
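For reference, the option lives in the [workarounds] section of nova.conf, so disabling it is just:

    [workarounds]
    # RARP announce workaround; safe to disable on 2025.1 per the above.
    enable_qemu_monitor_announce_self = false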
Nguyen Huu Khoi
On Wed, Nov 26, 2025 at 6:23 PM Sean Mooney <smooney@redhat.com> wrote:
How did you "complete the migration"? Was it via the force-complete action? That can take several minutes to complete at the QEMU level, as it needs to transfer any outstanding memory or block data.
Also, what are you defining as the migration end time? When the VM on the destination starts? (If post-copy live migration is used, the migration is still in progress when that happens.) When the VM on the source is stopped? The VM has not been cleaned up, so the migration is not complete in Nova at that point: we need to remove any volume attachments, delete local files, and update Neutron in post-live-migrate before the migration is actually completed.
It will not be updated to completed until all cleanup on the source host is also complete. What you are reporting is not obviously indicative of a bug or error.
Without more information we can't really help you understand whether this is normal or not.
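If you want to observe exactly when Nova flips the record, something like this rough sketch with openstacksdk would show the transition (I'm assuming conn.compute.migrations() here; 'mycloud' is a placeholder, and this lists all migrations rather than filtering to one server):

    import time

    import openstack

    conn = openstack.connect(cloud='mycloud')
    while True:
        # os-migrations is an admin API; each record has a status field.
        statuses = [m.status for m in conn.compute.migrations()]
        print(statuses)
        if 'running' not in statuses:
            break
        time.sleep(30)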
On 26/11/2025 01:18, Nguyễn Hữu Khôi wrote:
Hello.
I am working on instance migration. After I complete the migration, when I run "openstack server migration list" to check its status, it still shows as running. It takes around 12 minutes before it updates to completed status.
My OpenStack release: 2025.1
Thank you.
Nguyen Huu Khoi