[Openstack] KVM live block migration: stability, future, docs

Jay Pipes jaypipes at gmail.com
Tue Aug 7 20:13:22 UTC 2012


On 08/07/2012 08:57 AM, Blair Bethwaite wrote:
> Hi Sébastien,
> 
> Thanks for responding! By the way, I have come across your blog post
> regarding this and should reference it for the list:
> http://www.sebastien-han.fr/blog/2012/07/12/openstack-block-migration/
> 
> On 7 August 2012 17:45, Sébastien Han <han.sebastien at gmail.com> wrote:
>> I think it's a pretty useful feature, a good compromise. As you said, using a
>> shared fs implies a lot of things and can dramatically decrease your
>> performance compared to using the local fs.
> 
> Agreed, scale-out distributed file-systems are hard. Consistent
> hashing based systems (like Gluster and Ceph) seem like the answer to
> many of the existing issues with systems trying to mix scalability,
> performance and POSIX compliance. But the key issue is how one
> measures "performance" for these systems... throughput for large
> synchronous reads & writes may scale linearly (up to network
> saturation), but random IOPS are another thing entirely. As far as I
> can tell, random IOPS are the primary metric of concern in the design
> of the nova-compute storage, whereas both capacity and throughput
> requirements are relatively easy to specify and simply represent hard
> limits that must be met to support the various instance flavours you
> plan to offer.
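
Agreed that random IOPS is the number that actually hurts. A quick
back-of-envelope makes the shape of the problem obvious (all numbers
below are assumptions, not measurements): streaming transfers are
bandwidth-bound and scale until the wire is full, while small random
I/O is bound by per-request latency and queue depth, so it tops out
far below wire speed:

    link_bytes_per_sec = 10e9 / 8   # one 10GbE link, ~1.25 GB/s
    round_trip = 0.0005             # assume ~0.5 ms per request over the network
    queue_depth = 8                 # assume 8 outstanding requests per guest
    io_size = 4 * 1024              # 4 KiB random I/O

    seq_mb_s = link_bytes_per_sec / 1e6        # streaming: bandwidth-bound
    rand_iops = queue_depth / round_trip       # random: latency-bound
    rand_mb_s = rand_iops * io_size / 1e6

    print("streaming: ~%d MB/s" % seq_mb_s)                           # ~1250 MB/s
    print("random 4K: ~%d IOPS (~%d MB/s)" % (rand_iops, rand_mb_s))  # ~16000, ~65 MB/s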
> 
> It's interesting to note that Red Hat do not recommend using RHS (Red
> Hat Storage), their RHEL-based appliance built on Gluster (which they
> now own), for live VM storage.
> 
> Additionally, operations issues are much harder to handle with a DFS
> (even NFS), e.g., how can I put an upper limit on disk I/O for any
> particular instance when its ephemeral disk files are across the
> network and potentially striped into opaque objects across multiple
> storage bricks...?
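
Worth noting there are two layers you can throttle at today: cgroups
blkio limits, which (as far as I know) only apply to local block
devices and so disappear once the data is going over the network, and
qemu's own per-virtual-device throttling (exposed as blkdeviotune /
the <iotune> element in newer libvirt/qemu), which sits in front of
whatever the backing storage is. A rough sketch of the latter with
the libvirt Python bindings - domain name, device target and the
limits are all made up, and it assumes a qemu/libvirt new enough to
support it:

    import libvirt

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("instance-00000042")   # made-up domain name

    # Cap the ephemeral disk ("vda" here) at ~100 MB/s and 500 IOPS -- made-up limits
    limits = {
        "total_bytes_sec": 100 * 1024 * 1024,
        "total_iops_sec": 500,
    }
    dom.setBlockIoTune("vda", limits, libvirt.VIR_DOMAIN_AFFECT_LIVE)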

For the record, we at AT&T are also interested in this area and will
likely do our own testing over the next 6-12 months. We will of course
release any findings to the mailing list, and hopefully we can
collaborate on this.

>> I tested it and I will use it
>> for my deployment. I'll be happy to discuss more deeply with you about this
>> feature :)
> 
> Great. We have tested too. Compared to regular (non-block) live
> migrate, we don't see much difference in the guest - both scenarios
> involve a minute or two of interruption as the guest is moved (e.g.
> VNC and SSH sessions hang temporarily), which I find slightly
> surprising - is that your experience too?

Why would you find this surprising? I'm just curious...
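
It would be interesting to put a number on it, though. Even something
as crude as timestamped pings from a third host while the migration
runs will show how long the guest actually goes dark (the guest
address below is made up):

    import subprocess, time

    GUEST = "10.0.0.42"   # made-up guest address

    while True:
        ok = subprocess.call(["ping", "-c", "1", "-W", "1", GUEST],
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE) == 0
        print("%.1f %s" % (time.time(), "up" if ok else "DOWN"))
        time.sleep(0.5)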

>> I also feel a little concern about this statement:
>>
>>>  It don't work so well, it complicates migration code, and we are building
>>> a replacement that works.
>>
>>
>> I have to go further with my tests, maybe we could share some ideas, use
>> case etc...
> 
> I think it may be worth asking about this on the KVM lists, unless
> anyone here has further insights...?
> 
> I grabbed the KVM 1.0 source from Ubuntu Precise and vanilla KVM 1.1.1
> from Sourceforge; block migration appears to remain in place despite
> those (sparse) comments from the KVM meeting minutes (though I am
> naive to the source layout and project structure, so could have easily
> missed something). In any case, it seems unlikely Precise would see a
> forced update to the 1.1.x series.

cc'd Daniel Berrange, who seems to be keyed in on upstream KVM/Qemu
activity. Perhaps Daniel could shed some light.
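
In the meantime it's easy enough to poke the code path directly
through libvirt and see whether it still behaves - block migration is
just a normal live migration with the non-shared-storage flag set
(which, if I remember right, is what nova's block_migration option
turns into). A minimal sketch with the Python bindings, host and
domain names made up; the destination generally needs matching
(empty) image files in place first, which nova takes care of:

    import libvirt

    src = libvirt.open("qemu:///system")
    dom = src.lookupByName("instance-00000042")     # made-up domain name

    flags = (libvirt.VIR_MIGRATE_LIVE |
             libvirt.VIR_MIGRATE_UNDEFINE_SOURCE |
             libvirt.VIR_MIGRATE_NON_SHARED_DISK)   # copy the disks, not just RAM

    dst = libvirt.open("qemu+ssh://node02/system")  # made-up destination host
    dom.migrate(dst, flags, None, None, 0)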

Best,
-jay



