<div dir="ltr">Hi Daniel. Just a quick heads-up that I'm very interested in this and am taking a look. Thanks for working on it. I have seen live migration take 45-60 minutes (for fairly active VMs with lots of RAM). I'll give more feedback this weekend when I have a chance to look at this in detail.</div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jan 30, 2015 at 9:47 AM, Daniel P. Berrange <span dir="ltr"><<a href="mailto:berrange@redhat.com" target="_blank">berrange@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">In working on a recent Nova migration bug<br>

<br>

  <a href="https://bugs.launchpad.net/nova/+bug/1414065" target="_blank">https://bugs.launchpad.net/nova/+bug/1414065</a><br>

<br>

I had cause to refactor the way the nova libvirt driver monitors live<br>

migration completion/failure/progress. This refactor has opened the<br>

door for doing more intelligent active management of the live migration<br>

process.<br>

<br>

As it stands today, we launch live migration, with a possible bandwidth<br>

limit applied and just pray that it succeeds eventually. It might take<br>

until the end of the universe and we'll happily wait that long. This is<br>

pretty dumb really and I think we really ought to do better. The problem<br>

is that I'm not really sure what "better" should mean, except for ensuring<br>

it doesn't run forever.<br>

<br>

As a demo, I pushed a quick proof of concept showing how we could easily<br>

just abort live migration after say 10 minutes<br>

<br>

  <a href="https://review.openstack.org/#/c/151665/" target="_blank">https://review.openstack.org/#/c/151665/</a><br>

<br>

There are a number of possible things to consider though...<br>

<br>

First, how to detect when live migration isn't going to succeed.<br>

<br>

 - Could do a crude timeout, eg allow 10 minutes to succeed or else.<br>

<br>

 - Look at data transfer stats (memory transferred, memory remaining to<br>

   transfer, disk transferred, disk remaining to transfer) to determine<br>

   if it is making forward progress.<br>

<br>

 - Leave it up to the admin / user to decide if it has gone on long enough<br>

<br>

The first is easy, while the second is harder but probably more reliable<br>

and useful for users.<br>
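
As a rough illustration of the first two detection options, here is a minimal<br>
sketch that combines a crude timeout with a forward-progress check. The<br>
Sample type, the 60 second window and the 10% threshold are all assumptions<br>
made for this example; they are not taken from Nova or libvirt.<br>

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    """One poll of migration stats (hypothetical shape, not a libvirt type)."""
    elapsed: float        # seconds since the migration started
    data_remaining: int   # bytes of memory + disk still to transfer


def is_stalled(samples: List[Sample], window: float = 60.0,
               min_drop: float = 0.1) -> bool:
    """Return True if the low-water mark of data_remaining has not improved
    by at least `min_drop` (a fraction) within the last `window` seconds."""
    if not samples:
        return False
    now = samples[-1].elapsed
    if now < window:
        return False  # too early to judge
    old = [s.data_remaining for s in samples if s.elapsed <= now - window]
    recent = [s.data_remaining for s in samples if s.elapsed > now - window]
    if not old or not recent:
        return False
    return min(recent) > min(old) * (1.0 - min_drop)


def should_abort(samples: List[Sample], timeout: float = 600.0,
                 window: float = 60.0) -> Optional[str]:
    """Return a reason to give up on the migration, or None to keep waiting."""
    if not samples:
        return None
    if samples[-1].elapsed >= timeout:
        return "timeout"   # crude option: hard limit, 10 minutes by default
    if is_stalled(samples, window):
        return "stalled"   # smarter option: no forward progress
    return None
```

Comparing low-water marks rather than raw consecutive samples avoids<br>
declaring a stall just because the remaining data bounces up and down while<br>
the guest dirties memory.<br>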

<br>

Second is a question of what to do when it looks to be failing<br>

<br>

 - Cancel the migration - leave it running on source. Not good if the<br>

   admin is trying to evacuate a host.<br>

<br>

 - Pause the VM - make it complete as non-live migration. Not good if<br>

   the guest workload doesn't like being paused<br>

<br>

 - Increase the bandwidth permitted. There is a built-in rate limit in<br>

   QEMU overridable via nova.conf. Could argue that the admin should just<br>

   set their desired limit in nova.conf and be done with it, but perhaps<br>

   there's a case for increasing it in special circumstances. eg for an<br>

   emergency host evacuation it is better to waste bandwidth &amp; complete the job,<br>

   but for non-urgent scenarios better to limit bandwidth &amp; accept failure ?<br>

<br>

 - Increase the maximum downtime permitted. This is the small time window<br>

   when the guest switches from source to dest. Too small and it'll never<br>

   switch, too large and it'll suffer unacceptable interruption.<br>

<br>

We could do some of these things automatically based on some policy<br>

or leave them up to the cloud admin/tenant user via new APIs<br>
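
To make the "automatic policy" idea concrete, here is one possible<br>
escalation order, purely as a sketch. The ordering (raise downtime first,<br>
then bandwidth for urgent cases, then pause, then cancel) and all the names<br>
are assumptions for illustration, not a proposal for the actual Nova code.<br>

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    WAIT = "wait"
    RAISE_DOWNTIME = "raise_downtime"
    RAISE_BANDWIDTH = "raise_bandwidth"
    PAUSE_GUEST = "pause_guest"
    CANCEL = "cancel"


@dataclass
class MigrationState:
    """Hypothetical snapshot of one migration and its admin-set limits."""
    stalled: bool            # no forward progress detected
    urgent: bool             # eg emergency evacuation of a host
    pause_ok: bool           # guest workload tolerates being paused
    downtime_ms: int         # current max downtime setting
    downtime_cap_ms: int     # largest downtime the policy allows
    bandwidth_mbps: int      # current bandwidth limit
    bandwidth_cap_mbps: int  # largest bandwidth the policy allows


def next_action(state: MigrationState) -> Action:
    """Pick the least disruptive remedy that is still available."""
    if not state.stalled:
        return Action.WAIT
    if state.downtime_ms < state.downtime_cap_ms:
        return Action.RAISE_DOWNTIME
    if state.urgent and state.bandwidth_mbps < state.bandwidth_cap_mbps:
        return Action.RAISE_BANDWIDTH
    if state.pause_ok:
        return Action.PAUSE_GUEST
    return Action.CANCEL
```

A different deployment might reasonably reorder these steps; the point is<br>
only that each remedy listed above maps onto a simple, testable decision.<br>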

<br>

Third, there's the question of other QEMU features we could make use of to<br>

stop problems in the first place<br>

<br>

 - Auto-converge flag - if you set this QEMU throttles back the CPUs<br>

   so the guest cannot dirty RAM pages as quickly. This is nicer than<br>

   pausing CPUs altogether, but could still be an issue for guests<br>

   which have strong performance requirements<br>

<br>

 - Page compression flag - if you set this QEMU does compression of<br>

   pages to reduce data that has to be sent. This is basically trading<br>

   off network bandwidth vs CPU burn. Probably a win unless you are<br>

   already heavily overcommitted on CPU on the host<br>
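
Both flags attack the same inequality: the migration can only converge if<br>
the guest dirties memory more slowly than it can be sent. The sketch below<br>
uses a deliberately simplified model, which is an assumption of this example<br>
rather than how QEMU computes anything: dirty rate scales linearly with the<br>
CPU time left to the guest, and compression multiplies the effective<br>
bandwidth by a fixed ratio.<br>

```python
def converges(dirty_rate: float, bandwidth: float,
              cpu_throttle: float = 0.0,
              compression_ratio: float = 1.0) -> bool:
    """Simplified model of whether pre-copy migration can ever finish.

    dirty_rate        -- bytes/s of guest memory dirtied when unthrottled
    bandwidth         -- bytes/s available on the wire
    cpu_throttle      -- fraction of guest CPU time taken away (0.0 to 1.0)
    compression_ratio -- uncompressed size / compressed size (>= 1.0)
    """
    effective_dirty_rate = dirty_rate * (1.0 - cpu_throttle)
    effective_bandwidth = bandwidth * compression_ratio
    return effective_dirty_rate < effective_bandwidth


def switchover_downtime_ms(remaining_bytes: float, bandwidth: float) -> float:
    """Rough time to send what is left once the guest is stopped."""
    return 1000.0 * remaining_bytes / bandwidth
```

For example, a guest dirtying 150 MB/s over a 100 MB/s link never converges<br>
in this model, but does so with a 50% CPU throttle or a 2x compression ratio.<br>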

<br>

Fourth there's a question of whether we should give the tenant user or<br>

cloud admin further APIs for influencing migration<br>

<br>

 - Add an explicit API for cancelling migration ?<br>

<br>

 - Add APIs for setting tunables like downtime, bandwidth on the fly ?<br>

<br>

 - Or drive some of the tunables like downtime, bandwidth, or policies<br>

   like cancel vs pause from flavour or image metadata properties ?<br>

<br>

 - Allow operations like evacuate to specify a live migration policy<br>

   eg switch to non-live migration after 5 minutes ?<br>
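
If the tunables were driven from flavour or image metadata, the lookup could<br>
be as small as the sketch below. The "live_migration:" key prefix, the key<br>
names, the default values and the rule that image properties override<br>
flavour extra specs are all hypothetical; none of them exist in Nova today.<br>

```python
from dataclasses import dataclass
from typing import Mapping

PREFIX = "live_migration:"  # hypothetical metadata namespace


@dataclass(frozen=True)
class MigrationPolicy:
    timeout_s: int = 600         # give up (or escalate) after this long
    max_downtime_ms: int = 500   # illustrative default, not a Nova value
    on_stall: str = "cancel"     # "cancel" or "pause"


def policy_from_metadata(flavor_specs: Mapping[str, str],
                         image_props: Mapping[str, str]) -> MigrationPolicy:
    """Build a policy from metadata; image properties win over the flavour."""
    merged = {}
    for source in (flavor_specs, image_props):
        for key, value in source.items():
            if key.startswith(PREFIX):
                merged[key[len(PREFIX):]] = value
    on_stall = merged.get("on_stall", "cancel")
    if on_stall not in ("cancel", "pause"):
        raise ValueError("unknown on_stall policy: %r" % on_stall)
    return MigrationPolicy(
        timeout_s=int(merged.get("timeout_s", 600)),
        max_downtime_ms=int(merged.get("max_downtime_ms", 500)),
        on_stall=on_stall,
    )
```

An evacuate call could then pass its own overrides on top of whatever the<br>
metadata yields, eg a 5 minute timeout with on_stall set to "pause".<br>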

<br>

The current code is so crude and there's a hell of a lot of options we<br>

can take. I'm just not sure which is the best direction for us to go<br>

in.<br>

<br>

What kind of things would be the biggest win from Operators' or tenants'<br>

POV ?<br>

<br>

Regards,<br>

Daniel<br>

<span class="HOEnZb"><font color="#888888">--<br>

|: <a href="http://berrange.com" target="_blank">http://berrange.com</a>      -o-    <a href="http://www.flickr.com/photos/dberrange/" target="_blank">http://www.flickr.com/photos/dberrange/</a> :|<br>

|: <a href="http://libvirt.org" target="_blank">http://libvirt.org</a>              -o-             <a href="http://virt-manager.org" target="_blank">http://virt-manager.org</a> :|<br>

|: <a href="http://autobuild.org" target="_blank">http://autobuild.org</a>       -o-         <a href="http://search.cpan.org/~danberr/" target="_blank">http://search.cpan.org/~danberr/</a> :|<br>

|: <a href="http://entangle-photo.org" target="_blank">http://entangle-photo.org</a>       -o-       <a href="http://live.gnome.org/gtk-vnc" target="_blank">http://live.gnome.org/gtk-vnc</a> :|<br>

<br>

_______________________________________________<br>

OpenStack-operators mailing list<br>

<a href="mailto:OpenStack-operators@lists.openstack.org">OpenStack-operators@lists.openstack.org</a><br>

<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators</a><br>

</font></span></blockquote></div><br></div>