[openstack-dev] [nova] [RFC] how to enable xbzrle compress for live migration

Daniel P. Berrange berrange at redhat.com
Thu Nov 26 18:19:34 UTC 2015


On Thu, Nov 26, 2015 at 05:39:04PM +0000, Daniel P. Berrange wrote:
> On Thu, Nov 26, 2015 at 11:55:31PM +0800, 少合冯 wrote:
> > 3.  dynamically choose when to activate xbzrle compress for live migration.
> >      This is the best.
> >      xbzrle really wants to be used if the network is not able to keep up
> > with the dirtying rate of the guest RAM.
> >      But how do I check whether the upcoming migration fits this situation?
> 
> FWIW, if we decide we want compression support in Nova, I think that
> having the Nova libvirt driver dynamically decide when to use it is
> the only viable approach. Unfortunately the way the QEMU support
> is implemented makes it very hard to use, as QEMU forces you to decide
> to use it upfront, at a time when you don't have any useful information
> on which to make the decision :-(  To be useful IMHO, we really need
> the ability to turn on compression on the fly for an existing active
> migration process. ie, we'd start migration off and let it run and
> only enable compression if we encounter problems with completion.
> Sadly we can't do this with QEMU as it stands today :-(
> 
> Oh and of course we still need to address the issue of RAM usage and
> communicating that need with the scheduler in order to avoid OOM
> scenarios due to large compression cache.
> 
> I tend to feel that the QEMU compression code is currently broken by
> design and needs rework in QEMU before it can be practically used in
> an autonomous fashion :-(

Actually thinking about it, there's not really any significant
difference between Option 1 and Option 3. In both cases we want
a nova.conf setting live_migration_compression=on|off to control
whether we want to *permit* use of compression.

The only real difference between 1 & 3 is whether migration has
compression enabled always, or whether we turn it on part way
through migration.

So although option 3 is our desired approach (which we can't
actually implement due to QEMU limitations), option 1 could
be made fairly similar if we start off with a very small
compression cache size which would have the effect of more or
less disabling compression initially.

We already have logic in the code for dynamically increasing
the max downtime value, which we could mirror here.

e.g. something like:

 live_migration_compression=on|off

  - Whether to enable use of compression

 live_migration_compression_cache_ratio=0.8

  - The maximum size of the compression cache relative to
    the guest RAM size. Must be less than 1.0

 live_migration_compression_cache_steps=10

  - The number of steps to take to get from initial cache
    size to the maximum cache size

 live_migration_compression_cache_delay=75

  - The time delay in seconds between increases in cache
    size


In the same way that we do with migration downtime, instead of
increasing cache size linearly, we'd increase it in ever larger
steps until we hit the maximum. So we'd start off fairly small, at
a few MB, and, while monitoring the cache hit rates, we'd increase it
periodically.  If the number of steps configured and time delay
between steps are reasonably large, that would have the effect
that most migrations would have a fairly small cache and would
complete without needing much compression overhead.
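
To make the stepping idea concrete, here is a minimal sketch of how such
a schedule could be computed. It is not existing Nova code: the function
name, the 4 MB starting size and the choice of a geometric progression
are assumptions made purely for illustration, and the parameters simply
mirror the proposed config options above.

```python
def cache_size_steps(guest_ram_mb, ratio=0.8, steps=10, delay=75,
                     initial_mb=4):
    """Yield (seconds_since_start, cache_size_mb) pairs.

    Hypothetical helper: sizes grow geometrically from initial_mb up to
    guest_ram_mb * ratio, so each increase is larger than the previous one.
    """
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be greater than 0 and less than 1.0")
    if steps < 1:
        raise ValueError("steps must be at least 1")

    max_mb = int(guest_ram_mb * ratio)
    if max_mb < 1:
        raise ValueError("guest RAM too small for a compression cache")

    # Never start above the maximum (matters only for tiny guests).
    start_mb = min(initial_mb, max_mb)

    # Constant growth factor that reaches max_mb after 'steps' increases.
    factor = (max_mb / start_mb) ** (1.0 / steps)

    for i in range(steps + 1):
        size_mb = min(int(round(start_mb * factor ** i)), max_mb)
        yield (delay * i, size_mb)


if __name__ == "__main__":
    # Schedule for a 4 GB guest using the example settings above.
    for when, size in cache_size_steps(4096):
        print("t=%4ds  cache=%d MB" % (when, size))
```

A monitoring loop would walk this schedule and, each time the elapsed
time passes the next entry, raise the cache size (libvirt exposes
virDomainMigrateSetCompressionCache for that), ideally only when the
cache hit rates suggest a bigger cache would actually help.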

Doing this though, we still need a solution to the host OOM scenario
problem. We can't simply check free RAM at start of migration and
see if there's enough to spare for compression cache, as the scheduler
can spawn a new guest on the compute host at any time, pushing us into
OOM. We really need some way to indicate that there is a (potentially
very large) extra RAM overhead for the guest during migration.

ie if live_migration_compression_cache_ratio is 0.8 and we have a
4 GB guest, we need to make sure the scheduler knows that we are
potentially going to be using 7.2 GB of memory during migration.
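
The arithmetic behind that figure, as a small sketch. The function name
is hypothetical and this is not an existing scheduler interface; it only
shows the worst-case number that would need to be reported.

```python
def peak_migration_ram_gb(guest_ram_gb, cache_ratio):
    """Worst-case RAM for a guest during a compressed migration.

    Hypothetical helper: guest RAM plus a compression cache that is
    allowed to grow to cache_ratio * guest RAM.
    """
    if not 0.0 <= cache_ratio < 1.0:
        raise ValueError("cache_ratio must be >= 0 and less than 1.0")
    cache_gb = guest_ram_gb * cache_ratio
    return guest_ram_gb + cache_gb


if __name__ == "__main__":
    # 4 GB guest with ratio 0.8: 4 GB + 3.2 GB of cache.
    print("%.1f GB" % peak_migration_ram_gb(4, 0.8))
```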

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


