[openstack-dev] [nova] post-copy live migration

Daniel P. Berrange berrange at redhat.com
Tue Apr 5 15:33:28 UTC 2016


On Tue, Apr 05, 2016 at 05:17:41PM +0200, Luis Tomas wrote:
> Hi,
> 
> We are working on the possibility of including post-copy live migration into
> Nova (https://review.openstack.org/#/c/301509/)
> 
> At the libvirt level, post-copy live migration works as follows:
>     - Start live migration with a post-copy enabler flag
> (VIR_MIGRATE_POSTCOPY). Note this does not mean the migration is performed
> in post-copy mode, just that you can switch it to post-copy at any given
> time.
>     - Change the migration from pre-copy to post-copy mode.
> 
> However, we are not sure what the most convenient way of providing this
> functionality at the Nova level is.
> The current spec proposes adding an optional flag to the live migration
> API that sets the VIR_MIGRATE_POSTCOPY flag when starting the live
> migration. We then propose a second API to actually switch the migration
> from pre-copy to post-copy mode, similar to how it is done in libvirt. This
> is also similar to how the new "force-migrate" option works to ensure
> migration completion. In fact, this method could be an extension of
> force-migrate: switch to post-copy if the migration was started with
> the VIR_MIGRATE_POSTCOPY libvirt flag, or pause it otherwise.
> 
> The downside of this approach is that we expose an overly specific mechanism
> through the API. To alleviate this, we could remove the "switch" API and
> automate the switch based on data transferred, available bandwidth or
> other related metrics. However, we would still need the extension to the
> live-migration API to include the proper libvirt post-copy flag.

No, we absolutely don't want to expose that in the API as a concept, as it
is a private technical implementation detail of the KVM migration code.

> The other solution is to start all migrations with the
> VIR_MIGRATE_POSTCOPY flag, so that no new APIs would be needed. The
> system could automatically detect that the migration is taking too long (or
> that the guest is dirtying memory faster than the sending rate) and switch
> to post-copy.

Yes, this is what we should be doing as the default behaviour with new enough
QEMU, IMHO.
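
As a rough illustration of the automatic switch, here is a minimal sketch of
the decision logic, assuming a monitoring loop that periodically samples the
running migration. The names MigrationSample and should_switch_to_postcopy,
and the timeout and progress thresholds, are hypothetical and not part of
Nova or libvirt. With libvirt-python, the actual switch would then be
requested with the domain's migrateStartPostCopy() call on a migration that
was started with VIR_MIGRATE_POSTCOPY.

```python
from dataclasses import dataclass


@dataclass
class MigrationSample:
    """One progress sample from a running pre-copy migration (hypothetical)."""
    elapsed_s: float      # seconds since the migration started
    data_remaining: int   # bytes still to transfer at this sample


def should_switch_to_postcopy(samples, timeout_s=300.0, min_progress=0.10):
    """Return True if pre-copy looks unlikely to converge on its own.

    Switch when the migration has run for at least timeout_s seconds, or
    when the remaining data shrank by less than min_progress (a fraction)
    between the previous and the latest sample, i.e. the guest is dirtying
    memory roughly as fast as it is being sent.
    """
    if not samples:
        return False
    latest = samples[-1]
    if latest.elapsed_s >= timeout_s:
        return True
    if len(samples) < 2:
        return False
    previous = samples[-2]
    if previous.data_remaining <= 0:
        return False
    shrink = (previous.data_remaining - latest.data_remaining) / previous.data_remaining
    return shrink < min_progress
```

The thresholds are placeholders; a real policy would presumably tune them
against available bandwidth and guest size.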

> The downside of this is that including the VIR_MIGRATE_POSTCOPY flag has an
> overhead, so it would not be desirable to include it for all migrations,
> especially if they can be migrated nicely in pre-copy mode. In addition, if
> the migration fails after the switch, the VM will be lost. Therefore,
> admins may want to ensure that post-copy is not used for some specific VMs.

We shouldn't be trying to run before we can walk. Even if post-copy
hurts some guests, it'll still be a net win overall because it will
give a guarantee that migration can complete without needing to stop
guest CPUs entirely. All we need to start with is a nova.conf setting
to let the admin turn off use of post-copy for the host, for cases where
we want to prioritize performance over the ability to migrate successfully.

Any plan wrt changing migration behaviour on a per-VM basis needs to
consider a much broader set of features than just post-copy. For example,
compression, autoconverge and max-downtime settings all have an overhead
or impact on the guest too. We don't want to end up exposing API flags to
turn any of these on/off individually. So any solution to this will have
to look at a combination of usage context and some kind of SLA marker on
the guest. E.g., if the migration is in the context of host-evacuate, which
absolutely must always complete in finite time, we should always use
post-copy. If the migration is in the context of load-balancing workloads
across hosts, then some aspect of guest SLA must inform whether Nova chooses
to use post-copy, or compression or auto-converge, etc.
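
To make the idea concrete, here is a minimal sketch of such a selection,
assuming two usage contexts and a two-level SLA marker. Everything in it is
hypothetical: the Context and GuestSLA enums, the choose_migration_features
function, the host_permits_postcopy argument (standing in for the nova.conf
setting mentioned above), and the fallback to auto-converge are illustrative
assumptions, not an existing Nova interface or an agreed policy.

```python
from enum import Enum


class Context(Enum):
    HOST_EVACUATE = "host-evacuate"   # must always complete in finite time
    LOAD_BALANCE = "load-balance"     # optional, can be retried later


class GuestSLA(Enum):
    PERFORMANCE = "performance"       # guest should not be slowed down
    STANDARD = "standard"


def choose_migration_features(context, sla, host_permits_postcopy=True):
    """Pick migration features from usage context and guest SLA.

    Returns a set of feature names. An empty set means plain pre-copy.
    """
    # Assumption: auto-converge is the fallback when the host admin has
    # turned post-copy off.
    completion_aid = "post-copy" if host_permits_postcopy else "auto-converge"

    if context is Context.HOST_EVACUATE:
        # Evacuation must finish, regardless of guest SLA.
        return {completion_aid}
    if sla is GuestSLA.PERFORMANCE:
        # Load balancing a performance-sensitive guest: avoid any overhead.
        return set()
    return {completion_aid}
```

The point is only that callers never pass a post-copy flag; the choice
follows from why the migration is happening and what the guest was promised.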

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


