[Openstack-operators] [Openstack] KVM live block migration: stability, future, docs

Blair Bethwaite blair.bethwaite at gmail.com
Tue Aug 7 12:57:17 UTC 2012

Hi Sébastien,

Thanks for responding! By the way, I have come across your blog post
regarding this and should reference it for the list:

On 7 August 2012 17:45, Sébastien Han <han.sebastien at gmail.com> wrote:
> I think it's a pretty useful feature, a good compromise. As you said using a
> shared fs implies a lot of things and can dramatically decrease your
> performance rather than using the local fs.

Agreed, scale-out distributed file systems are hard. Systems based on
consistent hashing (like Gluster and Ceph) seem like the answer to
many of the existing issues with systems trying to mix scalability,
performance and POSIX compliance. But the key issue is how one
measures "performance" for these systems... throughput for large
synchronous reads & writes may scale linearly (up to network
saturation), but random IOPS are another thing entirely. As far as I
can tell, random IOPS are the primary metric of concern in the design
of the nova-compute storage, whereas both capacity and throughput
requirements are relatively easy to specify and simply represent hard
limits that must be met to support the various instance flavours you
plan to offer.

It's interesting to note that Red Hat do not recommend using RHS
(Red Hat Storage), their RHEL-based Gluster appliance (they now own
Gluster), for live VM storage.

Additionally, operational issues are much harder to handle with a DFS
(even NFS), e.g., how can I put an upper limit on disk I/O for any
particular instance when its ephemeral disk files are across the
network and potentially striped into opaque objects across multiple
storage bricks...?
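
For contrast, with a local ephemeral disk file a per-instance cap can
at least be expressed in one place. The sketch below builds a libvirt
<disk> element carrying an <iotune> block. The <iotune> element and
its total_iops_sec / total_bytes_sec children are part of libvirt's
domain XML; the file path, device name and limit values are
hypothetical, and whether the limits are enforced depends on the
QEMU build supporting I/O throttling. Nothing here is what nova
itself generates.

```python
import xml.etree.ElementTree as ET


def disk_with_iotune(source_file, target_dev, total_iops_sec, total_bytes_sec):
    """Build a libvirt <disk> element with an <iotune> cap (illustrative only)."""
    disk = ET.Element("disk", {"type": "file", "device": "disk"})
    ET.SubElement(disk, "driver", {"name": "qemu", "type": "qcow2"})
    ET.SubElement(disk, "source", {"file": source_file})
    ET.SubElement(disk, "target", {"dev": target_dev, "bus": "virtio"})

    # Per-disk limits: combined read+write IOPS and bytes per second.
    iotune = ET.SubElement(disk, "iotune")
    ET.SubElement(iotune, "total_iops_sec").text = str(total_iops_sec)
    ET.SubElement(iotune, "total_bytes_sec").text = str(total_bytes_sec)
    return disk


# Hypothetical instance path and limits, chosen only for the example.
xml_text = ET.tostring(
    disk_with_iotune(
        "/var/lib/nova/instances/instance-00000001/disk",
        "vda",
        total_iops_sec=200,
        total_bytes_sec=50 * 1024 * 1024,
    ),
    encoding="unicode",
)
print(xml_text)
```

With the disk file on a DFS there is no equivalent single place on
the storage side to apply such a cap, which is the operational
problem described above.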

> I tested it and I will use it
> for my deployment. I'll be happy to discuss more deeply with you about this
> feature :)

Great. We have tested too. Compared to regular (non-block) live
migration, we don't see much difference in the guest - both scenarios
involve a minute or two of interruption as the guest is moved (e.g.
VNC and SSH sessions hang temporarily), which I find slightly
surprising - is that your experience too?
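
For anyone on the list wanting to reproduce the comparison at the
libvirt level, the two cases differ by one flag. The sketch below
only builds the virsh command line and does not run it. The --live
and --copy-storage-all options are real virsh migrate options; the
domain name, destination host and the qemu+ssh transport are
assumptions for illustration, and nova drives libvirt through its
API rather than through virsh.

```python
def build_migrate_cmd(domain, dest_host, block=False):
    """Return the virsh argv for a live migration (illustrative only).

    block=False: regular live migration, disk on shared storage.
    block=True:  block migration, disk contents copied to the target.
    """
    cmd = ["virsh", "migrate", "--live"]
    if block:
        # Copy the full disk image to the destination during migration.
        cmd.append("--copy-storage-all")
    # Assumed connection URI form; adjust transport to the deployment.
    cmd += [domain, "qemu+ssh://{}/system".format(dest_host)]
    return cmd


# Hypothetical domain and host names.
print(" ".join(build_migrate_cmd("instance-00000001", "compute-02")))
print(" ".join(build_migrate_cmd("instance-00000001", "compute-02", block=True)))
```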

> I also feel a little concern about this statement:
>>  It don't work so well, it complicates migration code, and we are building
>> a replacement that works.
> I have to go further with my tests, maybe we could share some ideas, use
> case etc...

I think it may be worth asking about this on the KVM lists, unless
anyone here has further insights...?

I grabbed the KVM 1.0 source from Ubuntu Precise and vanilla KVM 1.1.1
from SourceForge. Block migration appears to remain in place in both,
despite those (sparse) comments from the KVM meeting minutes (though I
am new to the source layout and project structure, so could easily
have missed something). In any case, it seems unlikely Precise would
see a forced update to the 1.1.x series.

