[Openstack-operators] [nova][libvirt] RFC: ensuring live migration ends

Daniel P. Berrange berrange at redhat.com
Fri Jan 30 16:47:16 UTC 2015


In working on a recent Nova migration bug

  https://bugs.launchpad.net/nova/+bug/1414065

I had cause to refactor the way the nova libvirt driver monitors live
migration completion/failure/progress. This refactor has opened the
door for doing more intelligent active management of the live migration
process.

As it stands today, we launch live migration, with a possible bandwidth
limit applied, and just pray that it succeeds eventually. It might take
until the end of the universe and we'll happily wait that long. This is
pretty dumb really and I think we really ought to do better. The problem
is that I'm not really sure what "better" should mean, except for ensuring
it doesn't run forever.

As a demo, I pushed a quick proof of concept showing how we could easily
just abort live migration after, say, 10 minutes

  https://review.openstack.org/#/c/151665/

There are a number of possible things to consider though...

First, how to detect when live migration isn't going to succeed.

 - Could do a crude timeout, eg allow 10 minutes to succeed or else
   give up.

 - Look at data transfer stats (memory transferred, memory remaining to
   transfer, disk transferred, disk remaining to transfer) to determine
   if it is making forward progress.

 - Leave it up to the admin / user to decide if it has gone on long enough

The first is easy, while the second is harder but probably more reliable
and useful for users.
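
A rough sketch of how the first two options could be combined, purely as
an illustration. The class name, its parameters and the "stalled" rule
(no new low-water mark for the remaining data within a window) are all
assumptions, not anything in Nova or libvirt. The remaining-bytes sample
would have to come from the transfer stats mentioned above.

```python
import time


class MigrationWatchdog:
    """Hypothetical helper: decide whether a live migration looks stuck."""

    def __init__(self, max_elapsed=600.0, stall_window=120.0):
        self.max_elapsed = max_elapsed    # crude overall timeout, seconds
        self.stall_window = stall_window  # max seconds without a new low
        self.started = None
        self.best_remaining = None        # lowest remaining-bytes seen
        self.best_at = None               # when that low was recorded

    def update(self, remaining_bytes, now=None):
        """Feed one sample; return 'ok', 'stalled' or 'timeout'."""
        now = time.monotonic() if now is None else now
        if self.started is None:
            self.started = now
        # Forward progress means the remaining data hit a new minimum.
        if self.best_remaining is None or remaining_bytes < self.best_remaining:
            self.best_remaining = remaining_bytes
            self.best_at = now
        if now - self.started >= self.max_elapsed:
            return "timeout"
        if now - self.best_at >= self.stall_window:
            return "stalled"
        return "ok"
```

Tracking a low-water mark rather than comparing consecutive samples
tolerates the remaining figure bouncing up and down while the guest
dirties memory, and only reports a stall when no net progress is made.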

Second is a question of what to do when it looks to be failing

 - Cancel the migration - leave it running on source. Not good if the
   admin is trying to evacuate a host.

 - Pause the VM - make it complete as non-live migration. Not good if
   the guest workload doesn't like being paused

 - Increase the bandwidth permitted. There is a built-in rate limit in
   QEMU overridable via nova.conf. Could argue that the admin should just
   set their desired limit in nova.conf and be done with it, but perhaps
   there's a case for increasing it in special circumstances. eg for an
   emergency evacuation of a host it is better to spend bandwidth &
   complete the job, but for non-urgent scenarios it is better to limit
   bandwidth & accept failure ?

 - Increase the maximum downtime permitted. This is the small time window
   when the guest switches from source to dest. Too small and it'll never
   switch, too large and it'll suffer unacceptable interruption.

We could do some of these things automatically based on some policy
or leave them up to the cloud admin/tenant user via new APIs
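
As one illustration of an automatic policy, the maximum downtime could
be raised in steps while the migration runs, so short downtimes are
tried first. This is only a sketch: the function name, the linear ramp
and the example numbers are assumptions. Each value would then be applied
through whatever call sets the downtime on the running job (libvirt has
virDomainMigrateSetMaxDowntime for this).

```python
def downtime_schedule(start_ms, max_ms, steps, step_delay_s):
    """Yield (seconds_since_start, downtime_ms) pairs.

    Hypothetical helper: ramps linearly from start_ms to max_ms over
    `steps` increases, one every step_delay_s seconds.
    """
    if steps < 1:
        raise ValueError("steps must be >= 1")
    if max_ms < start_ms:
        raise ValueError("max_ms must be >= start_ms")
    for i in range(steps + 1):
        # Integer arithmetic; the final step lands exactly on max_ms.
        downtime = start_ms + (max_ms - start_ms) * i // steps
        yield (i * step_delay_s, downtime)
```

For example, list(downtime_schedule(50, 500, 3, 60)) gives the pairs
(0, 50), (60, 200), (120, 350), (180, 500).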

Third, there's the question of other QEMU features we could make use of
to stop problems in the first place

 - Auto-converge flag - if you set this QEMU throttles back the CPUs
   so the guest cannot dirty RAM pages as quickly. This is nicer than
   pausing CPUs altogether, but could still be an issue for guests
   which have strong performance requirements

 - Page compression flag - if you set this QEMU does compression of
   pages to reduce data that has to be sent. This is basically trading
   off network bandwidth vs CPU burn. Probably a win unless you are
   already highly overcommitted on CPU on the host
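
The trade-offs in the two bullets above could be written down as a
simple selection rule. This is a sketch only: the function, its inputs
and the overcommit threshold are made up for illustration. The strings
are meant to name libvirt's migration flag constants, to be resolved
with getattr() on the libvirt module in real code.

```python
def choose_migration_flags(host_cpu_overcommit, guest_latency_sensitive,
                           overcommit_limit=2.0):
    """Hypothetical policy: return names of migration flags to request.

    host_cpu_overcommit: ratio of allocated vCPUs to physical CPUs.
    guest_latency_sensitive: True if the guest cannot tolerate having
        its CPUs throttled.
    overcommit_limit: assumed threshold above which compression is
        judged too costly in host CPU time.
    """
    flags = ["VIR_MIGRATE_LIVE"]
    # Auto-converge slows the guest down, so skip it for guests with
    # strong performance requirements.
    if not guest_latency_sensitive:
        flags.append("VIR_MIGRATE_AUTO_CONVERGE")
    # Compression trades host CPU for bandwidth, so skip it when the
    # host CPU is already heavily overcommitted.
    if host_cpu_overcommit < overcommit_limit:
        flags.append("VIR_MIGRATE_COMPRESSED")
    return flags
```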

Fourth, there's a question of whether we should give the tenant user or
cloud admin further APIs for influencing migration

 - Add an explicit API for cancelling migration ?

 - Add APIs for setting tunables like downtime, bandwidth on the fly ?

 - Or drive some of the tunables like downtime, bandwidth, or policies
   like cancel vs pause from flavour or image metadata properties ?

 - Allow operations like evacuate to specify a live migration policy
   eg switch to non-live migration after 5 minutes ?
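
To make the metadata idea concrete, here is a sketch of turning flavour
and image properties into a policy object. Everything in it is
hypothetical: the "x_live_migration_" key prefix, the field names, the
defaults, and the choice to let image properties override the flavour.
No such keys exist in Nova today.

```python
from dataclasses import dataclass

PREFIX = "x_live_migration_"  # hypothetical metadata key prefix


@dataclass(frozen=True)
class MigrationPolicy:
    timeout_s: int = 600          # give up on live migration after this
    on_timeout: str = "cancel"    # "cancel" or "pause" (finish non-live)
    max_downtime_ms: int = 500    # cap for the switch-over window
    bandwidth_mbps: int = 0       # 0 means "use the configured default"


def policy_from_metadata(flavor_specs, image_props):
    """Build a policy from string-valued metadata dicts.

    Image properties win over flavour specs in this sketch.
    """
    merged = {**flavor_specs, **image_props}
    fields = (("timeout_s", int), ("on_timeout", str),
              ("max_downtime_ms", int), ("bandwidth_mbps", int))
    kwargs = {}
    for name, convert in fields:
        raw = merged.get(PREFIX + name)
        if raw is not None:
            kwargs[name] = convert(raw)
    policy = MigrationPolicy(**kwargs)
    if policy.on_timeout not in ("cancel", "pause"):
        raise ValueError("on_timeout must be 'cancel' or 'pause'")
    return policy
```

An operation such as evacuate could then pass its own overrides on top,
eg forcing on_timeout to "pause" so the host is always emptied.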

The current code is so crude and there's a hell of a lot of options we
can take. I'm just not sure which is the best direction for us to go
in.

What kind of things would be the biggest win from Operators' or tenants'
POV ?

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


