[openstack-dev] [nova] [RFC] how to enable xbzrle compress for live migration

Koniszewski, Pawel pawel.koniszewski at intel.com
Fri Nov 27 12:17:06 UTC 2015


> -----Original Message-----
> From: Daniel P. Berrange [mailto:berrange at redhat.com]
> Sent: Friday, November 27, 2015 12:50 PM
> To: 少合冯
> Cc: Feng, Shaohe; OpenStack Development Mailing List (not for usage
> questions); Xiao, Guangrong; Ding, Jian-feng; Dong, Eddie; Wang, Yong Y; Jin,
> Yuntong
> Subject: Re: [openstack-dev] [nova] [RFC] how to enable xbzrle compress for
> live migration
> 
> On Fri, Nov 27, 2015 at 07:37:50PM +0800, 少合冯 wrote:
> > 2015-11-27 2:19 GMT+08:00 Daniel P. Berrange <berrange at redhat.com>:
> >
> > > On Thu, Nov 26, 2015 at 05:39:04PM +0000, Daniel P. Berrange wrote:
> > > > On Thu, Nov 26, 2015 at 11:55:31PM +0800, 少合冯 wrote:
> > > > > 3.  dynamically choose when to activate xbzrle compress for live
> > > > >     migration.
> > > > >      This is the best.
> > > > >      xbzrle really wants to be used if the network is not able to
> > > > >      keep up with the dirtying rate of the guest RAM.
> > > > >      But how do I check whether the coming migration fits this
> > > > >      situation?
> > > >
> > > > FWIW, if we decide we want compression support in Nova, I think
> > > > that having the Nova libvirt driver dynamically decide when to use
> > > > it is the only viable approach. Unfortunately the way the QEMU
> > > > support is implemented makes it very hard to use, as QEMU forces
> > > > you to decide to use it upfront, at a time when you don't have any
> > > > useful information on which to make the decision :-(  To be useful
> > > > IMHO, we really need the ability to turn on compression on the fly
> > > > for an existing active migration process, i.e. we'd start migration
> > > > off, let it run, and only enable compression if we encounter
> > > > problems with completion.
> > > > Sadly we can't do this with QEMU as it stands today :-(
> > > >
> > >
> > [Shaohe Feng]
> > Adding more people who work on the kernel/hypervisor to our loop.
> > I wonder whether there will be any good solutions to improve this in
> > QEMU in the future.
> >
> >
> > > > Oh and of course we still need to address the issue of RAM usage
> > > > and communicating that need to the scheduler in order to avoid
> > > > OOM scenarios due to a large compression cache.
> > > >
> > > > I tend to feel that the QEMU compression code is currently broken
> > > > by design and needs rework in QEMU before it can be practically
> > > > used in an autonomous fashion :-(
> > >
> > > Actually thinking about it, there's not really any significant
> > > difference between Option 1 and Option 3. In both cases we want a
> > > nova.conf setting live_migration_compression=on|off to control
> > > whether we want to *permit* the use of compression.
> > >
> > > The only real difference between 1 & 3 is whether migration has
> > > compression enabled always, or whether we turn it on partway through
> > > the migration.
> > >
> > > So although option 3 is our desired approach (which we can't
> > > actually implement due to QEMU limitations), option 1 could be made
> > > fairly similar if we start off with a very small compression cache
> > > size, which would have the effect of more or less disabling
> > > compression initially.
> > >
> > > We already have logic in the code for dynamically increasing the max
> > > downtime value, which we could mirror here
> > >
> > > eg something like
> > >
> > >  live_migration_compression=on|off
> > >
> > >   - Whether to enable use of compression
> > >
> > >  live_migration_compression_cache_ratio=0.8
> > >
> > >   - The maximum size of the compression cache relative to
> > >     the guest RAM size. Must be less than 1.0
> > >
> > >  live_migration_compression_cache_steps=10
> > >
> > >   - The number of steps to take to get from initial cache
> > >     size to the maximum cache size
> > >
> > >  live_migration_compression_cache_delay=75
> > >
> > >   - The time delay in seconds between increases in cache
> > >     size
> > >
> > >
> > > In the same way that we do with migration downtime, instead of
> > > increasing cache size linearly, we'd increase it in ever larger
> > > steps until we hit the maximum. So we'd start off fairly small, at a
> > > few MB, and, monitoring the cache hit rates, increase it
> > > periodically.  If the number of steps configured and the time delay
> > > between steps are reasonably large, that would have the effect that
> > > most migrations would have a fairly small cache and would complete
> > > without needing much compression overhead.
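> > >
> > > In rough pseudo-Python (a sketch only; the helper name
> > > compression_cache_steps is made up, this is not actual Nova code),
> > > the stepping could look like:
> > >
> > >     def compression_cache_steps(guest_ram_bytes, ratio, steps, delay):
> > >         # Yield (wait_seconds, cache_size_bytes) pairs, growing the
> > >         # cache exponentially from a small initial size up to
> > >         # ratio * guest RAM, mirroring the downtime-steps logic.
> > >         min_cache = 8 * 1024 * 1024  # assume an 8 MiB initial cache
> > >         max_cache = int(guest_ram_bytes * ratio)
> > >         for i in range(1, steps + 1):
> > >             # power curve: early steps stay near min_cache, later
> > >             # steps approach max_cache
> > >             size = int(min_cache *
> > >                        (max_cache / float(min_cache)) **
> > >                        (i / float(steps)))
> > >             yield (delay, min(size, max_cache))
> > >
> > > With the defaults above (10 steps, 75 seconds apart) a 4 GB guest
> > > would only reach the full 3.2 GB cache if the migration were still
> > > running after roughly 12 minutes.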
> > >
> > > Doing this though, we still need a solution to the host OOM scenario
> > > problem. We can't simply check free RAM at the start of migration and
> > > see if there's enough to spare for the compression cache, as the
> > > scheduler can spawn a new guest on the compute host at any time,
> > > pushing us into OOM. We really need some way to indicate that there
> > > is a (potentially very large) extra RAM overhead for the guest during
> > > migration.

What about CPU? Compression is CPU-intensive, so we might end up with a live migration that degrades the performance of other VMs on the source and/or destination node. AFAIK CPUs are heavily oversubscribed in many deployments, and this does not help. I'm not sure this fits into Nova, as it would require resource monitoring.

> > > i.e. if live_migration_compression_cache_ratio is 0.8 and we have a
> > > 4 GB guest, we need to make sure the scheduler knows that we are
> > > potentially going to be using 7.2 GB of memory during migration
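> > >
> > > To make the accounting explicit (illustrative arithmetic only, with
> > > made-up variable names, not a real scheduler API):
> > >
> > >     guest_ram_mb = 4096          # 4 GB guest
> > >     cache_ratio = 0.8            # live_migration_compression_cache_ratio
> > >     peak_mb = guest_ram_mb * (1 + cache_ratio)  # 7372.8 MB ~= 7.2 GB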
> > >
> > >
> > [Shaohe Feng]
> > These suggestions sound good.
> > Thank you, Daniel.
> >
> > Do we need to consider this factor:
> >   It seems XBZRLE compression is executed after the bulk stage. During
> >   the bulk stage, we can calculate a transfer rate. If the transfer
> >   rate is below a certain threshold value, we can set a bigger cache
> >   size.
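> >
> >   e.g. a rough sketch of the rate estimate, using the libvirt job
> >   stats (this assumes the python-libvirt bindings and the standard
> >   "time_elapsed"/"memory_processed" stats keys; the helper name is
> >   made up):
> >
> >       def bulk_stage_rate_mbs(dom):
> >           # rough transfer rate in MB/s from the migration job stats
> >           stats = dom.jobStats()
> >           elapsed = stats.get("time_elapsed", 0) / 1000.0  # ms -> s
> >           if not elapsed:
> >               return 0.0
> >           return (stats.get("memory_processed", 0) /
> >                   elapsed / (1024 * 1024))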
> 
> I think it is probably sufficient to just look at the xbzrle cache hit rates every
> "live_migration_compression_cache_delay" seconds and decide how to tune
> the cache size based on that.
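>
> A rough sketch of that tuning loop against the python-libvirt bindings
> (this assumes libvirt 1.0.3 or newer, where the
> migrateGetCompressionCache/migrateSetCompressionCache calls and the
> "compression_*" job stats keys exist; the 20% miss-rate threshold is an
> arbitrary placeholder):
>
>     import time
>
>     def tune_compression_cache(dom, steps, delay):
>         # Every `delay` seconds, check the xbzrle cache miss rate and
>         # double the compression cache while too many guest pages are
>         # missing the cache.
>         for _ in range(steps):
>             time.sleep(delay)
>             stats = dom.jobStats()
>             misses = stats.get("compression_cache_misses", 0)
>             hits = stats.get("compression_pages", 0)
>             total = hits + misses
>             if total and misses / float(total) > 0.2:
>                 cur = dom.migrateGetCompressionCache()
>                 dom.migrateSetCompressionCache(cur * 2)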
> 
> 
> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
> 