[nova] Guidance on Instance Migration Issues in Production
Hi All,

I’m facing some issues with instance migration in production and need advice:

1. Migrations are taking longer than expected. Any tips to handle delays?
2. When the host is fully utilized or overcommit ratio is high, migration sometimes fails with "valid host not found". Are there other common causes?
3. Can we use a separate network interface for migration in OpenStack, and is it recommended?

Any guidance or references would be appreciated.

Thanks,
Thamanna Farhath N
https://www.facebook.com/Zybisys/

Disclaimer : The content of this email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to which they are addressed. If you have received this email in error, please notify the sender and remove the messages from your system. If you are not the named addressee, it is strictly forbidden for you to share, circulate, distribute or copy any part of this e-mail to any third party without the written consent of the sender. E-mail transmission cannot be guaranteed to be secured or error free as information could be intercepted, corrupted, lost, destroyed, arrive late, incomplete, or may contain viruses. Therefore, we do not accept liability for any errors or omissions in the contents of this message, which arise as a result of e-mail transmission. The recipient should check this e-mail and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email.
On 11/09/2025 08:42, Thamanna Farhath wrote:
Hi All,
I’m facing some issues with instance migration in production and need advice:
1. Migrations are taking longer than expected. Any tips to handle delays?
You can force the migration to complete via the API. This pauses the guest, forces the remaining memory copy, then un-pauses the guest on the destination once the copy is complete. https://docs.openstack.org/api-ref/compute/#force-migration-complete-action-...

In terms of recommendations, you should generally keep the default for the number of parallel migrations, which is 1, i.e. do not allow nova to start more than one live migration at a time. This is because you want each live migration to race to completion rather than run many in parallel. I also generally recommend enabling post_copy live migration or, failing that, auto converge. post_copy is effectively required for huge-page backed guests, certainly if you use 1G huge-pages, and it is good to consider for any workload with high memory usage.
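As a rough illustration of the force-complete call, here is a minimal Python sketch that only builds the HTTP request and does not send it. It assumes the compute API's `force_complete` server-migration action (available from microversion 2.22); the endpoint URL, token, and IDs are placeholders, and the helper name `build_force_complete_request` is made up for this example.

```python
import json
import urllib.request


def build_force_complete_request(compute_endpoint, server_id, migration_id, token):
    """Build (but do not send) a force_complete request for a running live migration.

    All arguments are placeholders supplied by the caller; migration_id is the
    ID of the in-progress migration for that server.
    """
    url = (
        f"{compute_endpoint.rstrip('/')}"
        f"/servers/{server_id}/migrations/{migration_id}/action"
    )
    # The action body is a single key with a null value.
    body = json.dumps({"force_complete": None}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Auth-Token": token,
        # force_complete needs compute microversion 2.22 or later.
        "X-OpenStack-Nova-API-Version": "2.22",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_force_complete_request(
        "http://controller.example:8774/v2.1", "SERVER_UUID", "7", "TOKEN"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Keeping the request construction separate from the send makes it easy to inspect the URL and body before touching a production guest, since forcing completion pauses the instance.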
2. When the host is fully utilized or overcommit ratio is high, migration sometimes fails with "valid host not found". Are there other common causes?
If you are specifying a destination host, it is possible that the destination would violate one of the requirements enforced by the scheduler filters. If you are allowing nova to choose, which should be your default, then it again implies that a scheduler constraint would be violated, such as a server group with the hard affinity policy.
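For the "host is fully utilized or overcommit ratio is high" case in the question, a small sketch of the capacity arithmetic may help when checking why no host qualifies. It assumes the placement-style formula capacity = (total - reserved) * allocation_ratio; the function names and the numbers are illustrative only and are not taken from the thread.

```python
def capacity(total, reserved, allocation_ratio):
    """Schedulable capacity of one resource class on a host."""
    return int((total - reserved) * allocation_ratio)


def fits(total, reserved, allocation_ratio, used, requested):
    """True if the requested amount still fits within the host's capacity."""
    return used + requested <= capacity(total, reserved, allocation_ratio)


if __name__ == "__main__":
    # Hypothetical destination: 32 physical cores, 4.0 CPU overcommit,
    # 126 vCPUs already allocated, migrating guest needs 4 vCPUs.
    print(capacity(32, 0, 4.0))
    print(fits(32, 0, 4.0, used=126, requested=4))
```

Running the same check per resource class (vCPU, memory, disk) against each candidate destination can show whether the failure is plain capacity or a policy constraint such as a server group.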
3. Can we use a separate network interface for migration in OpenStack, and is it recommended?
Yes you can, and yes you should, unless your default route is at least 10Gb/s. By default nova will live migrate over your default route via what is typically the management interface. If that is only 1G, it is recommended to configure https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.liv... on each host to the IP address of a faster interface. It does not need to be a dedicated interface; it can be shared with neutron or storage traffic, but for timely migration you should use at least a 10G interface, and the faster the better, up to a point.

Nova currently only supports single-threaded live migration. There is some work to enable multi-threaded live migration happening upstream, so live migration speed is currently limited by the single-threaded performance of a host CPU core. That means that beyond 10-25G there is little benefit for live migration today. Counterintuitively, latency tends to be lower over 25G links than 40G NICs because a 40G NIC is really just 4 10G connections.
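To put the 1G versus 10G advice in numbers, here is a back-of-the-envelope Python sketch. It estimates the time for a single full pass over guest RAM at a given link speed; it is an idealised lower bound that ignores protocol overhead, pages dirtied during the copy, and the single-threaded CPU limit mentioned above. The function name and the 64 GiB guest are assumptions for illustration.

```python
def single_pass_seconds(guest_ram_gib, link_gbps):
    """Idealised time to copy guest RAM once over a link of link_gbps Gb/s."""
    ram_bits = guest_ram_gib * 1024**3 * 8  # GiB -> bits
    return ram_bits / (link_gbps * 1e9)     # Gb/s -> bits per second


if __name__ == "__main__":
    for gbps in (1, 10, 25):
        secs = single_pass_seconds(64, gbps)
        print(f"64 GiB guest over {gbps:>2} Gb/s: about {secs:.0f} s per pass")
```

Because a busy guest keeps dirtying memory while the copy runs, real migrations need several passes, so the gap between a 1G and a 10G link is usually larger in practice than this single-pass estimate suggests.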
Any guidance or references would be appreciated.
Thanks, Thamanna Farhath N
participants (2)
- Sean Mooney
- Thamanna Farhath