[Openstack-operators] [nova] Live migration performance tests on 100 compute nodes

Koniszewski, Pawel pawel.koniszewski at intel.com
Fri Dec 30 12:56:23 UTC 2016


This was a bandwidth issue. Nova stayed connected to the broker, but RPC messages started to time out. For example, we lost some of the RPC messages that trigger the post-live-migration steps, which update the nova DB to reflect the instance's new host.

A good workaround for this issue is to slightly limit the bandwidth used for live migrations through nova.conf [1], using the live_migration_bandwidth config option in the [libvirt] section. By default it is set to 0, so it is effectively unlimited. Also, please be aware that we changed the default live migration configuration in OpenStack Newton: tunneling is now off by default (live_migration_tunnelled in the [libvirt] section is set to False) because of its huge performance impact.
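
As a rough sketch of the settings described above, the snippet below embeds an example [libvirt] fragment of nova.conf and reads the two options back with Python's standard configparser. The value 800 is purely illustrative (this thread does not recommend a specific number), the helper name migration_settings is hypothetical, and the unit is assumed to be MiB/s as described in the Newton configuration reference.

```python
import configparser

# Example nova.conf fragment. The cap of 800 is an illustrative value only,
# not a number taken from the test results.
NOVA_CONF = """
[libvirt]
live_migration_bandwidth = 800
live_migration_tunnelled = False
"""


def migration_settings(conf_text):
    """Return (bandwidth_cap, tunnelled) from the [libvirt] section."""
    # Disable interpolation so '%' characters elsewhere in a real
    # nova.conf do not confuse the parser.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # Fall back to the defaults mentioned in the message when an option
    # is absent: 0 (no limit) and tunnelling off.
    bandwidth = parser.getint("libvirt", "live_migration_bandwidth", fallback=0)
    tunnelled = parser.getboolean("libvirt", "live_migration_tunnelled", fallback=False)
    return bandwidth, tunnelled


bandwidth, tunnelled = migration_settings(NOVA_CONF)
if bandwidth == 0:
    print("live_migration_bandwidth is 0: live migrations are not rate limited")
else:
    print(f"live migrations are capped at {bandwidth} (assumed MiB/s)")
print(f"tunnelling enabled: {tunnelled}")
```

A sensible cap depends on the link speed and on how much headroom the control plane (RPC traffic to the broker) needs, so the number should be tuned per environment.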

[1] http://docs.openstack.org/newton/config-reference/compute/config-options.html

Kind Regards,
Pawel Koniszewski

From: tadowguy at gmail.com [mailto:tadowguy at gmail.com] On Behalf Of Matt Fischer
Sent: Friday, December 30, 2016 5:31 AM
To: Koniszewski, Pawel <pawel.koniszewski at intel.com>
Cc: openstack-operators at lists.openstack.org
Subject: Re: [Openstack-operators] [nova] Live migration performance tests on 100 compute nodes

On Wed, Dec 28, 2016 at 6:11 AM, Koniszewski, Pawel <pawel.koniszewski at intel.com> wrote:
Hello everyone,

We did some research to see how live migration performance varies between configurations; in particular, we wanted to compare tunneled and non-tunneled live migrations. To test live migration, we simulated 0-day patching of 100 compute nodes (including reboots) with workloads close to real-world ones. All the results are published [1], along with the environment configuration and a description of how we built the test framework. Hope you find this useful.

[1] https://01.org/openstack/blogs/pkoniszewski/2016/ossc-zero-day-patching

Kind Regards,
Pawel Koniszewski

Thanks for the write-up. I'm curious about your RabbitMQ connection failures. Was it nova-compute failing to connect? Was it a bandwidth or heartbeat issue?