[openstack-dev] [nova] [RFC] how to enable xbzrle compress for live migration

Koniszewski, Pawel pawel.koniszewski at intel.com
Mon Nov 30 11:09:32 UTC 2015


> -----Original Message-----
> From: Murray, Paul (HP Cloud) [mailto:pmurray at hpe.com]
> Sent: Friday, November 27, 2015 4:29 PM
> To: Daniel P. Berrange; Carlton, Paul (Cloud Services)
> Cc: 少合冯; OpenStack Development Mailing List (not for usage questions);
> John Garbutt; Koniszewski, Pawel; Jin, Yuntong; Feng, Shaohe; Qiao, Liyong
> Subject: RE: [nova] [RFC] how to enable xbzrle compress for live migration
>
>
>
> > -----Original Message-----
> > From: Daniel P. Berrange [mailto:berrange at redhat.com]
> > Sent: 26 November 2015 17:58
> > To: Carlton, Paul (Cloud Services)
> > Cc: 少合冯; OpenStack Development Mailing List (not for usage questions);
> > John Garbutt; pawel.koniszewski at intel.com; yuntong.jin at intel.com;
> > shaohe.feng at intel.com; Murray, Paul (HP Cloud); liyong.qiao at intel.com
> > Subject: Re: [nova] [RFC] how to enable xbzrle compress for live
> > migration
> >
> > On Thu, Nov 26, 2015 at 05:49:50PM +0000, Paul Carlton wrote:
> > > Seems to me the prevailing view is that we should get live migration
> > > to figure out the best setting for itself where possible.  There was
> > > discussion of being able to have a default policy setting that will
> > > allow the operator to define the balance between speed of migration
> > > and impact on the instance.  This could be a global default for the
> > > cloud with overriding defaults per aggregate, image, tenant and
> > > instance, as well as the ability to vary the setting during the
> > > migration operation.
> > >
> > > Seems to me that items like compression should be set in
> > > configuration files based on what works best given the cloud
> > > operator's environment?
> >
> > Merely turning on use of compression is the "easy" bit - there needs
> > to be a way to deal with compression cache size allocation, which
> > needs to have some smarts in Nova, as there's no usable "one size fits
> > all" value for the compression cache size. If we did want to hardcode
> > a compression cache size, you'd have to set it as a scaling factor
> > against the guest RAM size.
> > This is going to be very heavy on memory usage, so there needs careful
> > design work to solve the problem of migration compression triggering
> > host OOM scenarios, particularly since we can have multiple concurrent
> > migrations.
> >
>
>
> Use cases for live migration generally fall into two types:
>
> 1. I need to empty the host (host maintenance/reboot)
>
> 2. I generally want to balance load on the cloud
>
> The first case is by far the most common need right now and in that case
> the node gets progressively more empty as VMs are moved off. So the
> resources available for caching etc. grow as the process goes on.
>I'd rather say that these resources might shrink. You need to turn off one
>compute node, stack more VMs on the remaining compute nodes, and allocate
>cache on both sides, source and destination.

>Why do we need a cache on the destination?
>XBZRLE sends only a delta over the network and works in two phases:
>compression and decompression. During compression the original page and
>the updated page are XORed together and the result is passed to the RLE
>algorithm; the output is the delta page, which is sent over the network
>to the destination host. During decompression, run-length decoding expands
>each symbol-counter pair, and the original page is XORed with the decoded
>result; the output is the updated page. It means that it needs to
>allocate cache on the source and destination nodes.

>But I think the RAM on the destination already holds the original page. It
>just needs to decompress with the delta.

>It does not need extra cache.

Ah, got it. I misunderstood the concept: the destination host's copy of the
page is updated in every iteration, so there is no need to keep a cache there.
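
To make the point concrete, here is a rough Python sketch of the XOR + RLE
flow described above. It is a simplified illustration only, not QEMU's actual
XBZRLE encoding: the function names, the (symbol, count) run representation
and the 4096-byte page size are all assumptions made for the example. The
source needs the previously sent page (its cache) to build the delta, while
the destination only needs the page it already holds in guest RAM.

```python
from typing import List, Tuple

PAGE_SIZE = 4096  # assumed page size, for illustration only

Run = Tuple[int, int]  # (symbol, count) pair, hypothetical representation


def xor_pages(a: bytes, b: bytes) -> bytes:
    """XOR two equally sized pages byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


def rle_encode(data: bytes) -> List[Run]:
    """Collapse runs of identical bytes into (symbol, count) pairs."""
    runs: List[Run] = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


def rle_decode(runs: List[Run]) -> bytes:
    """Expand (symbol, count) pairs back into the XORed page."""
    return b"".join(bytes([value]) * count for value, count in runs)


def compress(cached_page: bytes, updated_page: bytes) -> List[Run]:
    """Source side: needs the cached copy of the page sent earlier."""
    return rle_encode(xor_pages(cached_page, updated_page))


def decompress(page_in_guest_ram: bytes, delta: List[Run]) -> bytes:
    """Destination side: uses the page already in guest RAM, no cache."""
    return xor_pages(page_in_guest_ram, rle_decode(delta))


# A page that was all zeroes and then had four bytes dirtied.
old_page = bytes(PAGE_SIZE)
dirty = bytearray(old_page)
dirty[100:104] = b"\xde\xad\xbe\xef"
new_page = bytes(dirty)

delta = compress(old_page, new_page)
assert decompress(old_page, delta) == new_page
```

In this sketch the only extra memory is the source-side copy of old_page,
which is why the cache size matters on the source host only.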

Then we need something smart in Nova to decide whether XBZRLE can be turned on.

> The second case is less likely to be urgent from the operator's point of
> view, so doing things more slowly may not be a problem.
>
> So looking at how much resource is available at the start of a migration
> and deciding then what to do on a per VM basis is probably not a bad idea.
> Especially if we can differentiate between the two cases.
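
A per-VM check along those lines might look roughly like the sketch below.
Everything in it is hypothetical: none of these names are existing Nova
options or functions, and the 10% scaling factor is just a placeholder, not a
recommendation. It sizes the cache as a fraction of guest RAM, as Daniel
suggested, and refuses compression when the source host cannot spare that
much memory once the caches of concurrent migrations are counted.

```python
GIB = 2 ** 30


def xbzrle_cache_size(guest_ram_bytes: int, cache_percent: int = 10) -> int:
    """Cache size as a percentage of guest RAM (hypothetical policy)."""
    return guest_ram_bytes * cache_percent // 100


def can_enable_xbzrle(guest_ram_bytes: int,
                      host_free_bytes: int,
                      reserved_bytes: int,
                      active_cache_bytes: int = 0,
                      cache_percent: int = 10) -> bool:
    """Decide at migration start whether the source host can afford a cache.

    reserved_bytes is memory the operator wants kept free on the host;
    active_cache_bytes is the cache already promised to other migrations
    running concurrently from the same host.
    """
    needed = xbzrle_cache_size(guest_ram_bytes, cache_percent)
    available = host_free_bytes - reserved_bytes - active_cache_bytes
    return needed <= available


# Small guest on a host with plenty of headroom: compression is affordable.
small_ok = can_enable_xbzrle(4 * GIB, host_free_bytes=8 * GIB,
                             reserved_bytes=2 * GIB)

# Large guest on a nearly full host: fall back to uncompressed migration.
large_ok = can_enable_xbzrle(64 * GIB, host_free_bytes=4 * GIB,
                             reserved_bytes=2 * GIB)
```

With these numbers small_ok is True and large_ok is False. Counting
active_cache_bytes is what would keep several concurrent migrations from
pushing the host into OOM.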
