[openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

Vladik Romanovsky vladik.romanovsky at enovance.com
Sat Jan 31 02:55:23 UTC 2015



----- Original Message -----
> From: "Daniel P. Berrange" <berrange at redhat.com>
> To: openstack-dev at lists.openstack.org, openstack-operators at lists.openstack.org
> Sent: Friday, 30 January, 2015 11:47:16 AM
> Subject: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
> 
> In working on a recent Nova migration bug
> 
>   https://bugs.launchpad.net/nova/+bug/1414065
> 
> I had cause to refactor the way the nova libvirt driver monitors live
> migration completion/failure/progress. This refactor has opened the
> door for doing more intelligent active management of the live migration
> process.
> 
> As it stands today, we launch live migration, with a possible bandwidth
> limit applied and just pray that it succeeds eventually. It might take
> until the end of the universe and we'll happily wait that long. This is
> pretty dumb really and I think we really ought to do better. The problem
> is that I'm not really sure what "better" should mean, except for ensuring
> it doesn't run forever.
> 
> As a demo, I pushed a quick proof of concept showing how we could easily
> just abort live migration after say 10 minutes
> 
>   https://review.openstack.org/#/c/151665/
> 
> There are a number of possible things to consider though...
> 
> First, how to detect when live migration isn't going to succeed.
> 
>  - Could do a crude timeout, e.g. allow 10 minutes to succeed or else.
> 
>  - Look at data transfer stats (memory transferred, memory remaining to
>    transfer, disk transferred, disk remaining to transfer) to determine
>    if it is making forward progress.

I think this is the better option. We could define a timeout for progress
and cancel the migration if no progress is made within it. IIRC there were
similar debates about this in oVirt; we could do something similar:
https://github.com/oVirt/vdsm/blob/master/vdsm/virt/migration.py#L430
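
To make the idea concrete, a minimal sketch of such a watchdog on the
libvirt side might look roughly like this (the stats keys come from
jobStats(); the timeout values are made up for illustration, not proposed
defaults):

import time

import libvirt

# Made-up values for illustration; in nova these would come from nova.conf.
PROGRESS_TIMEOUT = 150   # seconds without forward progress before giving up
POLL_INTERVAL = 5        # seconds between polls of the migration job


def watch_migration(dom):
    """Abort a live migration that stops making forward progress.

    'dom' is a libvirt.virDomain with a migration job already running.
    """
    last_remaining = None
    last_progress = time.time()

    while True:
        time.sleep(POLL_INTERVAL)
        try:
            stats = dom.jobStats()
        except libvirt.libvirtError:
            return  # job finished or failed, nothing left to watch

        remaining = (stats.get('memory_remaining', 0) +
                     stats.get('disk_remaining', 0))

        # Any drop in the data left to transfer counts as forward progress.
        if last_remaining is None or remaining < last_remaining:
            last_remaining = remaining
            last_progress = time.time()
        elif time.time() - last_progress > PROGRESS_TIMEOUT:
            dom.abortJob()  # cancel; the guest keeps running on the source
            return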

> 
>  - Leave it up to the admin / user to decide if it has gone on long enough
> 
> The first is easy, while the second is harder but probably more reliable
> and useful for users.
> 
> Second is a question of what to do when it looks to be failing
> 
>  - Cancel the migration - leave it running on source. Not good if the
>    admin is trying to evacuate a host.
> 
>  - Pause the VM - make it complete as non-live migration. Not good if
>    the guest workload doesn't like being paused
> 
>  - Increase the bandwidth permitted. There is a built-in rate limit in
>    QEMU overridable via nova.conf. Could argue that the admin should just
>    set their desired limit in nova.conf and be done with it, but perhaps
>    there's a case for increasing it in special circumstances, e.g. for an
>    emergency host evacuation it is better to waste bandwidth & complete the
>    job, but for non-urgent scenarios better to limit bandwidth & accept failure ?
> 
>  - Increase the maximum downtime permitted. This is the small time window
>    when the guest switches from source to dest. Too small and it'll never
>    switch, too large and it'll suffer unacceptable interruption.
> 

In my opinion, it would be great if we could play with bandwidth and downtime
before cancelling or pausing the migration.
However, it only makes sense if there is some kind of progress in the transfer
stats and not a complete disconnect; in that case we should just cancel it.
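
Mapped onto the existing libvirt calls, the escalation could look roughly
like this (the ordering and the numbers are placeholders, not suggested
defaults):

import libvirt


def nudge_migration(dom, attempt):
    """One possible escalation when a migration looks to be struggling.

    'attempt' counts how many times we've already intervened; the values
    below are placeholders for whatever policy we would actually agree on.
    """
    if attempt == 0:
        # First allow more bandwidth (MiB/s) than the configured rate limit.
        dom.migrateSetMaxSpeed(500)
    elif attempt == 1:
        # Then tolerate a longer switch-over window (milliseconds).
        dom.migrateSetMaxDowntime(500, 0)
    else:
        # Still stuck: cancel and keep the guest on the source, or
        # dom.suspend() instead if completing the move matters more
        # than keeping the guest live.
        dom.abortJob()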

> We could do some of these things automatically based on some policy
> or leave them upto the cloud admin/tenant user via new APIs
> 
> Third there's question of other QEMU features we could make use of to
> stop problems in the first place
> 
>  - Auto-converge flag - if you set this QEMU throttles back the CPUs
>    so the guest cannot dirty ram pages as quickly. This is nicer than
>    pausing CPUs altogether, but could still be an issue for guests
>    which have strong performance requirements
> 
>  - Page compression flag - if you set this QEMU does compression of
>    pages to reduce data that has to be sent. This is basically trading
>    off network bandwidth vs CPU burn. Probably a win unless you are
>    already highly overcommitted on CPU on the host
> 
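
For reference, both of those are just extra flags on the migration call in
the libvirt python bindings, combined with whatever nova already passes.
Roughly (the destination URI and instance name here are made up, and
auto-converge needs a reasonably recent QEMU/libvirt):

import libvirt

flags = (libvirt.VIR_MIGRATE_LIVE |
         libvirt.VIR_MIGRATE_PEER2PEER |
         libvirt.VIR_MIGRATE_AUTO_CONVERGE |  # throttle vCPUs dirtying RAM too fast
         libvirt.VIR_MIGRATE_COMPRESSED)      # compress pages: CPU vs bandwidth

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('instance-00000001')  # made-up instance name
dom.migrateToURI3('qemu+tcp://dest-host/system', {}, flags)
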
> Fourth there's a question of whether we should give the tenant user or
> cloud admin further APIs for influencing migration
> 
>  - Add an explicit API for cancelling migration ?
> 
>  - Add APIs for setting tunables like downtime, bandwidth on the fly ?
> 
>  - Or drive some of the tunables like downtime, bandwidth, or policies
>    like cancel vs paused from flavour or image metadata properties ?
> 
>  - Allow operations like evacuate to specify a live migration policy,
>    e.g. switch to non-live migration after 5 minutes ?
> 
IMHO, an explicit API for cancelling a migration is very much needed.
I remember cases when migrations took 8 or more hours, leaving the
admins helpless :)

Also, I very much like the idea of having tunables and policies set
in the flavours and image properties:
allow the administrators to set these as a "template" in the flavour,
and also let the users update/override or "request" these options,
as they (hopefully) know best what is running in their guests.
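
As a strawman, the merge could be as simple as this (all key names are
made up for illustration; the real naming would need its own discussion):

# Hypothetical flavour extra specs acting as the admin's "template".
flavor_extra_specs = {
    'migration:completion_timeout': '600',  # seconds before intervening
    'migration:max_downtime': '500',        # milliseconds
    'migration:on_timeout': 'pause',        # or 'cancel'
}


def migration_policy(extra_specs, image_props):
    """Merge the flavour "template" with per-image overrides.

    Image properties win, on the theory that the user knows the workload
    best; all key names here are made up for illustration.
    """
    policy = dict(extra_specs)
    policy.update((k, v) for k, v in image_props.items()
                  if k.startswith('migration:'))
    return policy


# A user who knows their guest can't tolerate pausing overrides the default.
policy = migration_policy(flavor_extra_specs,
                          {'migration:on_timeout': 'cancel'})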
 


> The current code is so crude and there's a hell of a lot of options we
> can take. I'm just not sure which is the best direction for us to go
> in.
> 
> What kind of things would be the biggest win from Operators' or tenants'
> POV ?

Thanks for writing this! :)

> 
> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
> 


