[openstack-dev] [nova] live migration in Mitaka

Chris Friesen chris.friesen at windriver.com
Tue Sep 22 15:29:46 UTC 2015


Apologies for the indirect quote, some of the earlier posts got deleted before I 
noticed the thread.

On 09/21/2015 03:43 AM, Koniszewski, Pawel wrote:
>> -----Original Message-----
>> From: Daniel P. Berrange [mailto:berrange at redhat.com]

>> There was a proposal to nova to allow the 'pause' operation to be invoked
>> while migration was happening. This would turn a live migration into a
>> coma-migration, thereby ensuring it succeeds. I can't remember if this
>> merged or not, as I can't find the review offhand, but it's important to
>> have this ASAP IMHO, as when evacuating VMs from a host admins need a knob
>> to use to force successful evacuation, even at the cost of pausing the
>> guest temporarily.

It's not strictly "live" migration, but for the same reason of pushing VMs off a 
host for maintenance it would be nice to have some way of migrating suspended 
instances.  (As brought up in 
http://lists.openstack.org/pipermail/openstack-dev/2015-September/075042.html)
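In libvirt terms, the pause-during-migration knob amounts to suspending the domain while a migration job is in flight, which drops the guest's dirty-page rate to zero so the remaining pages transfer quickly. A minimal sketch with the libvirt Python bindings follows; the helper name and return convention are illustrative, not Nova's actual implementation:

```python
def pause_to_complete(conn, instance_name):
    """Force a non-converging live migration to finish by suspending the
    guest (turning it into a 'coma-migration').

    Hypothetical helper, not Nova code; `conn` is an open libvirt
    connection to the source host.
    """
    dom = conn.lookupByName(instance_name)
    # jobInfo()[0] is the job type; 0 (VIR_DOMAIN_JOB_NONE) means no
    # migration is in flight, so only pause when one is actually running.
    if dom.jobInfo()[0] != 0:
        dom.suspend()  # vCPUs stop; dirty-page rate drops to zero
        return True
    return False
```

The same call sequence would apply to migrating suspended instances: a suspended guest dirties no memory, so its migration is guaranteed to converge.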

>> In libvirt upstream we now have the ability to filter what disks are
>> migrated during block migration. We need to leverage that new feature to
>> fix the long standing problems of block migration when non-local images are
>> attached - eg cinder volumes. We definitely want this in Mitaka.

Agreed, this would be a very useful addition.
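The libvirt feature in question is the VIR_MIGRATE_PARAM_MIGRATE_DISKS migration parameter (libvirt >= 1.2.17), which lets the caller list exactly which disks to block-migrate. A rough sketch of how a driver could use it, skipping network-backed Cinder volumes; the `disk_info` input format and both helper names are made up for illustration:

```python
def local_disks_only(disk_info):
    """Return target device names of disks that should be block-migrated.

    `disk_info` is a hypothetical list of (target_dev, disk_type) tuples
    as would be parsed from the domain XML; 'network' disks (e.g. rbd or
    iSCSI Cinder volumes) are already reachable from the destination and
    must not be copied.
    """
    return [dev for dev, dtype in disk_info if dtype == 'file']

def migrate_with_disk_filter(dom, dest_uri, disk_info):
    # Requires libvirt >= 1.2.17 for VIR_MIGRATE_PARAM_MIGRATE_DISKS.
    import libvirt
    params = {
        libvirt.VIR_MIGRATE_PARAM_MIGRATE_DISKS: local_disks_only(disk_info),
    }
    flags = (libvirt.VIR_MIGRATE_LIVE |
             libvirt.VIR_MIGRATE_PEER2PEER |
             libvirt.VIR_MIGRATE_NON_SHARED_INC)
    dom.migrateToURI3(dest_uri, params, flags)
```

Without the filter, block migration tries to copy the Cinder volume's contents over a volume that is already attached on the destination, which is the long-standing corruption hazard mentioned above.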

>> We should look at what we need to do to isolate the migration data network
>> from the main management network. Currently we live migrate over whatever
>> network is associated with the compute hosts primary Hostname / IP address.
>> This is not necessarily the fastest NIC on the host. We ought to be able
>> to record an alternative hostname / IP address against each compute host to
>> indicate the desired migration interface.

Yes, this would be good to have upstream.  We've added this sort of thing 
locally (though with a hardcoded naming scheme) to allow migration over 10G 
links with management over 1G links.
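For reference, the stop-gap available today is the libvirt driver's `live_migration_uri` option, where `%s` is replaced with the target host's name. Pointing it at a name that resolves to the faster interface works, assuming suitable DNS or /etc/hosts entries; the `-mig` suffix below is a local naming convention for illustration, not anything Nova provides:

```ini
# nova.conf on each compute host (workaround, not the proposed feature)
[libvirt]
# With host entries like "compute1-mig" mapped to the 10G interface,
# migration traffic stays off the 1G management network.
live_migration_uri = qemu+tcp://%s-mig/system
```

A first-class per-host migration address, as proposed above, would remove the need for this naming-scheme hack.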

>> There is also work on post-copy migration in QEMU. Normally with live
>> migration, the guest doesn't start executing on the target host until
>> migration has transferred all data. There are many workloads where that
>> doesn't work, as the guest is dirtying data too quickly. With post-copy you
>> can start running the guest on the target at any time, and when it faults
>> on a missing page that will be pulled from the source host. This is
>> slightly more fragile, as you risk losing the guest entirely if the source
>> host dies before migration finally completes. It does guarantee that
>> migration will succeed no matter what workload is in the guest. This is
>> probably Nxxxx cycle material.

It seems to me that the ideal solution would be to start with pre-copy 
migration and then, if it fails to converge within the specified downtime 
value, have the option to cut over to the destination and finish with a 
post-copy migration of the remaining data.
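That hybrid scheme can be sketched against the libvirt API, with the caveat that the post-copy primitives (the VIR_MIGRATE_POSTCOPY flag and migrateStartPostCopy()) only landed in libvirt 1.3.3, after this thread; the watchdog structure and timeout policy here are illustrative, not an existing Nova design:

```python
def migrate_with_postcopy_fallback(dom, dest_uri, max_precopy_seconds):
    """Start a normal pre-copy live migration; if it has not finished
    within `max_precopy_seconds`, switch to post-copy so completion is
    guaranteed regardless of the guest's dirty-page rate.
    """
    import threading
    import time
    import libvirt

    # VIR_MIGRATE_POSTCOPY must be set when the migration starts, even
    # though the actual switch-over happens later.
    flags = (libvirt.VIR_MIGRATE_LIVE |
             libvirt.VIR_MIGRATE_PEER2PEER |
             libvirt.VIR_MIGRATE_POSTCOPY)

    def watchdog():
        time.sleep(max_precopy_seconds)
        # If pre-copy is still running after the deadline, cut over:
        # the guest starts on the destination and pulls remaining pages
        # from the source on fault.
        if dom.jobInfo()[0] != libvirt.VIR_DOMAIN_JOB_NONE:
            dom.migrateStartPostCopy()

    threading.Thread(target=watchdog, daemon=True).start()
    dom.migrateToURI3(dest_uri, {}, flags)  # blocks until migration ends
```

The fragility trade-off noted above still applies: once post-copy begins, losing the source host loses the guest, so the timeout effectively bounds how long the operator is willing to wait before accepting that risk.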

Chris


