[openstack-dev] [nova] [RFC] how to enable xbzrle compress for live migration

John Garbutt john at johngarbutt.com
Thu Nov 26 17:47:35 UTC 2015


On 26 November 2015 at 17:39, Daniel P. Berrange <berrange at redhat.com> wrote:
> On Thu, Nov 26, 2015 at 11:55:31PM +0800, 少合冯 wrote:
>> Hi all,
>> We want to support xbzrle compress for live migration.
>>
>> Now there are 3 options,
>> 1. Add an enable flag in nova.conf,
>>     such as a dedicated 'live_migration_compression=on|off' parameter,
>>     and have Nova simply enable it unconditionally.
>>     This seems not good.
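
For concreteness, this is roughly how such a flag might be registered
with oslo.config in the libvirt driver (the option name is purely
illustrative; nothing like it has been merged):

  # Hypothetical oslo.config option, for illustration only.
  from oslo_config import cfg

  libvirt_opts = [
      cfg.BoolOpt('live_migration_compression',
                  default=False,
                  help='Enable xbzrle compression for all live '
                       'migrations (illustrative, not a real option).'),
  ]

  cfg.CONF.register_opts(libvirt_opts, group='libvirt')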
>
> Just having a live_migration_compression=on|off parameter that
> unconditionally turns it on for all VMs is not really a solution
> on its own, as it leaves open the problem of compression cache
> memory size, which is at the root of the design problem.
>
> Without a sensible choice of cache size, the compression is
> either useless (too small and it won't get a useful number of
> cache hits, so won't save any data transfer bandwidth) or it is
> hugely wasteful of resources (too large and you're just sucking
> up host RAM for no benefit). The QEMU migration maintainers'
> guideline is that the cache size should be approximately
> equal to the guest RAM working set. IOW, for a 4 GB guest
> you potentially need a 4 GB cache for migration, so we're
> doubling the memory usage of a guest without the scheduler
> being any the wiser, which will inevitably cause the host
> to die from out-of-memory at some point.
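
To put numbers on that guideline: a scheduler that accounted for the
cache would have to claim roughly guest RAM plus cache. A hypothetical
helper (not existing Nova code) makes the arithmetic explicit:

  # Hypothetical helper, not existing Nova code. With the QEMU
  # guideline of "cache ~= guest RAM working set", cache_ratio=1.0
  # models the worst case: a 4096 MB guest claims 8192 MB on the host.
  def migration_memory_claim_mb(guest_ram_mb, cache_ratio=1.0):
      compression_cache_mb = int(guest_ram_mb * cache_ratio)
      return guest_ram_mb + compression_cache_mb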
>
>
>> 2. Add a parameter to the live migration API.
>>
>> A new optional array parameter, 'compress', will be added; the json-schema
>> is as below:
>>
>>   {
>>     'type': 'object',
>>     'properties': {
>>       'os-migrateLive': {
>>         'type': 'object',
>>         'properties': {
>>           'block_migration': parameter_types.boolean,
>>           'disk_over_commit': parameter_types.boolean,
>>           'compress': {
>>             'type': 'array',
>>             'items': {'type': 'string', 'enum': ['xbzrle']},
>>           },
>>           'host': host
>>         },
>>         'additionalProperties': False,
>>       },
>>     },
>>     'required': ['os-migrateLive'],
>>     'additionalProperties': False,
>>   }
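
For reference, a request body matching that proposed schema would look
something like this (host name illustrative):

  POST /v2.1/servers/{server_id}/action

  {
      "os-migrateLive": {
          "host": "compute-02",
          "block_migration": false,
          "disk_over_commit": false,
          "compress": ["xbzrle"]
      }
  }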
>
> I really don't think we want to expose this kind of hypervisor-
> specific detail in the live migration API of Nova. It just leaks
> too many low-level details. It still leaves the problem of deciding
> the compression cache size unsolved, and likewise the problem of the
> scheduler knowing about the memory usage for this cache in order to
> avoid OOM.

+1

>> 3. Dynamically choose when to activate xbzrle compression for live
>> migration. This is the best option.
>>      xbzrle really wants to be used only if the network is not able to
>> keep up with the dirtying rate of the guest RAM.
>>      But how do we check whether an upcoming migration fits this situation?
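
One conceivable heuristic, sketched below purely for discussion:
compare the guest's page dirtying rate against the observed transfer
rate using libvirt's job statistics. Note the 'memory_dirty_rate'
field assumes a libvirt new enough to report it.

  import libvirt

  def migration_is_falling_behind(dom):
      # dom is a libvirt.virDomain with a migration job in progress.
      stats = dom.jobStats()
      dirty_pages_per_sec = stats.get('memory_dirty_rate')  # pages/s
      transfer_bps = stats.get('memory_bps')                # bytes/s
      if dirty_pages_per_sec is None or not transfer_bps:
          return False  # not enough data to decide
      page_size = 4096  # assume 4 KiB guest pages
      return dirty_pages_per_sec * page_size > transfer_bps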
>
> FWIW, if we decide we want compression support in Nova, I think that
> having the Nova libvirt driver dynamically decide when to use it is
> the only viable approach. Unfortunately the way the QEMU support
> is implemented makes it very hard to use, as QEMU forces you to decide
> to use it upfront, at a time when you don't have any useful information
> on which to make the decision :-(  To be useful, IMHO, we really need
> the ability to turn on compression on the fly for an existing active
> migration process. I.e., we'd start the migration off, let it run, and
> only enable compression if we encounter problems with completion.
> Sadly we can't do this with QEMU as it stands today :-(
>
> Oh, and of course we still need to address the issue of RAM usage and
> communicate that need to the scheduler in order to avoid OOM
> scenarios due to a large compression cache.
>
> I tend to feel that the QEMU compression code is currently broken by
> design and needs rework in QEMU before it can be practically used in
> an autonomous fashion :-(

Honestly, most of the conversation seems to be leading that way.

It does seem a nice alternative to throttling the VM's performance when
trying to get the memory transfer to complete. But as you say, it seems
we can't use it in that way right now.
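
For reference, the knobs libvirt exposes today match that limitation:
the decision to compress has to be made up front via the
VIR_MIGRATE_COMPRESSED flag, and only the xbzrle cache size can be
tuned while the migration is running. A minimal sketch (domain and
URI names illustrative):

  import libvirt

  conn = libvirt.open('qemu:///system')
  dom = conn.lookupByName('instance-00000001')  # illustrative name

  # Compression must be requested before the migration starts.
  flags = (libvirt.VIR_MIGRATE_LIVE |
           libvirt.VIR_MIGRATE_PEER2PEER |
           libvirt.VIR_MIGRATE_COMPRESSED)
  dom.migrateToURI('qemu+tcp://dest-host/system', flags, None, 0)

  # From another thread watching the job, only the cache size can
  # be adjusted on the fly:
  #   dom.migrateSetCompressionCache(cache_size_in_bytes)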

johnthetubaguy


