Could enable_qemu_monitor_announce_self blocking be responsible for 12 _minutes_ of delay?  That sounds huge!

Also, can I ask if this is _only_ a problem with the OpenStack status reporting (i.e. "openstack server migration list")?  Or does it also affect the actual liveness of the migrated instance?

(Coincidentally, I am also currently investigating live migration.  I'm seeing a problem where data transfer on an existing connection to the instance is held up for about 12 seconds after the migration has completed.)


On Thu, Nov 27, 2025 at 12:15 PM Sean Mooney <smooney@redhat.com> wrote:


On 27/11/2025 11:57, Sean Mooney wrote:
>
>
> On 27/11/2025 02:46, Nguyễn Hữu Khôi wrote:
>> Hello.
>>
>> I just selected live migrate from Horizon without a destination. The
>> instance is on shared storage, and I don't use force-complete. It looks
>> like enable_qemu_monitor_announce_self = true causes this problem. It
>> is OK if I change it to false. I used this option on OpenStack Xena
>> because without it I could not ping or access instances after live
>> migration. That cloud uses OVS. I tested with my current cloud, 2025.1,
>> which uses OVN, and it seems we don't need this option anymore. Please
>> correct me if I am wrong.
I should have also said that this option was needed for OVN in the past
as well, but as of Caracal or so OVN and Neutron now support multiple
logical switch ports.
Before that, OVN also suffered from network downtime in a similar way to
OVS, especially on VLAN networks, because it would not set up any egress
flows until we activated the port binding in post-live-migrate, instead
of setting it up in pre-live-migrate like it should have.

This landed in Zed to add the Neutron support:
https://review.opendev.org/c/openstack/neutron/+/828455
I don't recall the specific OVN version required for that optimisation
to be present, but I vaguely have OVN 23? in my mind; that could be way
off.

OVN also has other issues with how it works today under load:
https://bugs.launchpad.net/neutron/+bug/2069718

There is a good YouTube presentation from the summit on how OVN can have
connectivity issues in larger deployments, or where the reconciliation
time for OVN is long:
https://www.youtube.com/watch?v=POjSOxKyrE0

> It's a workaround option, so it was never required.
> It just helps mitigate some downtime that could happen in older
> releases of OpenStack.
> It should not be required anymore.
>
> But this is an unexpected side effect, although it is benign overall.
>
> So what is happening is that when we invoke QEMU to send those RARP
> packets, we must be waiting for that to complete before completing the
> migration.
>
> I think I see the problem:
>
> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/guest.py#L665-L667
>
>
> We are directly calling the QEMU monitor via the libvirtmon module,
> which is a C module.
> We are expecting that to yield due to eventlet, but it's possible that
> that call is blocking.
>
> On one hand, we probably should be kicking that off in a background
> thread and just moving on with the rest of post live migration. On the
> other hand, this should not really be needed after Antelope or
> Bobcat-ish, so I'm not sure how useful it is to make that change now.
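
To illustrate the "kick it off in a background thread" idea above, here
is a minimal sketch using only the Python standard library. It is not
nova code: announce_self_blocking and post_live_migration are
hypothetical names, time.sleep stands in for the blocking monitor call,
and plain threading stands in for whatever mechanism nova would actually
use under eventlet.

```python
import threading
import time
from typing import Tuple


def announce_self_blocking(delay: float) -> None:
    """Hypothetical stand-in for a monitor call that blocks the caller."""
    time.sleep(delay)


def post_live_migration(delay: float) -> Tuple[float, threading.Thread]:
    """Start the announce in the background, then carry on immediately.

    Returns how long the caller was held up, plus the worker thread so
    the caller can join it later if it wants to.
    """
    start = time.monotonic()
    worker = threading.Thread(
        target=announce_self_blocking, args=(delay,), daemon=True
    )
    worker.start()
    # The rest of the post-live-migration cleanup would run here,
    # without waiting for the announce to finish.
    elapsed = time.monotonic() - start
    return elapsed, worker


if __name__ == "__main__":
    held_up, worker = post_live_migration(delay=0.5)
    print(f"caller was held up for {held_up:.3f}s")
    worker.join()
```

The point is only that the caller returns without waiting for the slow
call; error handling and logging for the background work are left out.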
>
> Anyway, yes, turning off the
> enable_qemu_monitor_announce_self workaround in 2025.1 should be safe.
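
For reference, the option discussed above lives in the [workarounds]
section of nova.conf on the compute hosts. A minimal fragment, assuming
the setting is applied per compute host and nova-compute is restarted
afterwards:

```ini
[workarounds]
# Do not ask QEMU to send announce_self (RARP) packets after live
# migration.
enable_qemu_monitor_announce_self = False
```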
>>
>> Nguyen Huu Khoi
>>
>>
>> On Wed, Nov 26, 2025 at 6:23 PM Sean Mooney <smooney@redhat.com> wrote:
>>> How did you "complete the migration"? Was it via the force-complete
>>> action?
>>> That can take several minutes to complete at the QEMU level, as it
>>> needs to transfer any outstanding memory or block data.
>>>
>>> Also, what are you defining as the migration end time? When the VM on
>>> the dest starts? (If post-copy live migration is used, the migration
>>> is still in progress when that happens.)
>>> When the VM on the source is stopped? The VM has not been cleaned up,
>>> so the migration is not complete in nova at this point, as we need to
>>> remove any volume attachments, delete local files and update Neutron
>>> in post-live-migrate before the migration is actually completed.
>>>
>>> It will not be updated to complete until all cleanup on the source
>>> host is also complete.
>>> What you are reporting is not obviously indicative of a bug or error.
>>>
>>> Without more information we can't really help you understand if this
>>> is normal or not.
>>>
>>> On 26/11/2025 01:18, Nguyễn Hữu Khôi wrote:
>>>> Hello.
>>>>
>>>> I am working on instance migration. After I complete the migration,
>>>> when I run "openstack server migration list" to check its status, it
>>>> still shows as "running". It takes around 12 minutes before it
>>>> updates to "completed" status.
>>>>
>>>> My OPS: 2025.1
>>>>
>>>> Thank you.
>>>>
>>>> Nguyen Huu Khoi
>