[Openstack-operators] RAID / stripe block storage volumes

Robert Starmer robert at kumul.us
Tue Feb 9 01:15:02 UTC 2016


I have not run into anyone replicating volumes or creating redundancy at
the VM level (beyond, as you point out, HDFS, etc.).

R

On Mon, Feb 8, 2016 at 6:54 PM, Joe Topjian <joe at topjian.net> wrote:

> This is a great conversation and I really appreciate everyone's input.
> Though, I agree, we wandered off the original question and that's my fault
> for mentioning various storage backends.
>
> For the sake of conversation, let's just say the user has no knowledge of
> the underlying storage technology. They're presented with a Block Storage
> service and the rest is up to them. What known, working options does the
> user have to build their own block storage resilience? (Ignoring "obvious"
> solutions where the application has native replication, such as Galera,
> Elasticsearch, etc.)
>
> I have seen references to Cinder supporting replication, but I'm not able
> to find a lot of information about it. The support matrix[1] lists very few
> drivers that actually implement replication -- is this true or is there a
> trove of replication docs that I just haven't been able to find?
>
> Amazon AWS publishes instructions on how to use mdadm with EBS[2]. One
> might interpret that to mean mdadm is a supported solution within
> EC2-based instances.
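>
> (For context, the sort of thing I mean is sketched below -- the device
> names, RAID level, and mount point are assumptions, and mdadm has to be
> installed in the guest:)
>
>     # Hypothetical sketch: mirror two attached volumes inside a guest.
>     # /dev/vdb and /dev/vdc are assumed device names; use whatever the
>     # hypervisor actually exposes.
>     import subprocess
>
>     def run(cmd):
>         print("+", " ".join(cmd))
>         subprocess.run(cmd, check=True)
>
>     # Create a RAID 1 array across the two volumes; --run skips the
>     # interactive confirmation prompt.
>     run(["mdadm", "--create", "/dev/md0", "--run", "--level=1",
>          "--raid-devices=2", "/dev/vdb", "/dev/vdc"])
>
>     # Filesystem and mount point are arbitrary choices for this sketch.
>     run(["mkfs.ext4", "/dev/md0"])
>     run(["mkdir", "-p", "/mnt/data"])
>     run(["mount", "/dev/md0", "/mnt/data"])
>
>     # Persist the array definition so it reassembles on reboot (the
>     # config path is the Debian/Ubuntu one; it varies by distro).
>     with open("/etc/mdadm/mdadm.conf", "a") as conf:
>         conf.write(subprocess.check_output(
>             ["mdadm", "--detail", "--scan"]).decode())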
>
> There are also references to using DRBD with EC2, though I could not find
> anything as "official" as the mdadm/EC2 documentation.
>
> Does anyone have experience (or know users) doing either? (specifically
> with libvirt/KVM, but I'd be curious to know in general)
>
> Or is it more advisable to create multiple instances with data replicated
> instance-to-instance, rather than a single instance with multiple volumes
> and data replicated volume-to-volume (by way of that single instance)? And
> if so, why? Is the lack of stable volume-to-volume replication a
> limitation of certain hypervisors?
>
> Or has this area just not been explored in depth within OpenStack
> environments yet?
>
> 1: https://wiki.openstack.org/wiki/CinderSupportMatrix
> 2: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/raid-config.html
>
>
> On Mon, Feb 8, 2016 at 4:10 PM, Robert Starmer <robert at kumul.us> wrote:
>
>> I'm not against Ceph, but even 2 machines (and really, 2 machines with
>> enough storage to be meaningful, e.g. not the all-blade environments I've
>> built some o7k systems on) may not be available for storage, so there are
>> cases where that's not necessarily the solution. I built resiliency in one
>> environment with a 2-node controller/Glance/db system backed by Gluster,
>> which provided enough middleware resiliency to meet the customer's
>> recovery expectations. Regardless, even with a cattle application model,
>> the infrastructure middleware still needs to be able to provide some
>> level of resiliency.
>>
>> But we've kind of wandered off the original question. To bring this back
>> on topic: I think users can build resilience into their own storage
>> constructions, but there are still use cases where the middleware either
>> needs its own resiliency layer or may end up providing one for the end
>> user.
>>
>> R
>>
>> On Mon, Feb 8, 2016 at 3:51 PM, Fox, Kevin M <Kevin.Fox at pnnl.gov> wrote:
>>
>>> We've used Ceph to address the storage requirement in small clouds pretty
>>> well. It works fine with only two storage nodes and replication set to 2,
>>> and because of the radosgw you can share your small amount of storage
>>> between the object store and the block store, avoiding the need to
>>> overprovision Swift-only or Cinder-only capacity to handle usage unknowns.
>>> It's just one pool of storage.
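>>>
>>> (If it helps to picture it, "replication set to 2" boils down to roughly
>>> the following -- the pool names here are assumptions, check yours with
>>> 'ceph osd lspools':)
>>>
>>>     # Hypothetical sketch: 2x replication on the pools a small cloud
>>>     # typically uses. Requires admin access to the Ceph cluster.
>>>     import subprocess
>>>
>>>     pools = ["volumes", "images", "vms", ".rgw.buckets"]  # assumed names
>>>
>>>     for pool in pools:
>>>         # Keep two copies of every object...
>>>         subprocess.run(
>>>             ["ceph", "osd", "pool", "set", pool, "size", "2"], check=True)
>>>         # ...and keep serving I/O while one of the two nodes is down.
>>>         subprocess.run(
>>>             ["ceph", "osd", "pool", "set", pool, "min_size", "1"],
>>>             check=True)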
>>>
>>> You're right, using LVM is like telling your users "don't do pets" but
>>> then having pets at the heart of your system. When you lose one, you lose
>>> a lot. With a small Ceph cluster, you can take out one of the nodes, burn
>>> it to the ground, put it back, and it just works. No pets.
>>>
>>> Do consider Ceph for the small use case.
>>>
>>> Thanks,
>>> Kevin
>>>
>>> ------------------------------
>>> From: Robert Starmer [robert at kumul.us]
>>> Sent: Monday, February 08, 2016 1:30 PM
>>> To: Ned Rhudy
>>> Cc: OpenStack Operators
>>>
>>> Subject: Re: [Openstack-operators] RAID / stripe block storage volumes
>>>
>>> Ned's model is what I meant by "multiple underlying storage services".
>>> Most of the systems I've built are LVM-only, a few added Ceph as an
>>> alternative/live-migration option, and one used Gluster due to its size.
>>> Note that the environments I've worked with are generally small (~20
>>> compute nodes), so huge Ceph environments aren't common for me. I am also
>>> working on a project where the storage backend is entirely NFS...
>>>
>>> And I think users are increasingly educated to assume that nothing is
>>> guaranteed. There is a realization, at least among a good set of the
>>> customers I've worked with (and I try to educate the non-believers), that
>>> the way to get the best out of a system like OpenStack is to consider
>>> everything disposable. The one gap I've seen is that plenty of folks
>>> don't deploy Swift, and without some form of object store there's still
>>> the question of where you place your datasets so that they can be quickly
>>> recovered (and how you keep them up to date if you do have one). With
>>> VMs, the idea is that you can recover quickly because the "dataset", e.g.
>>> your OS image, is already there for you, but in plenty of small
>>> environments that's only as true as the Glance repository (guess what's
>>> usually backing that when there's no Swift around...).
>>>
>>> So I see the issue as a holistic one. How do you show operators/users
>>> that they should consider everything disposable if we only look at the
>>> currently running instance as the "thing"? Somewhere you still likely
>>> need some form of distributed resilience (and yes, I can see using the
>>> distributed Canonical, CentOS, Red Hat, Fedora, Debian, etc. mirrors as
>>> your distributed image backup, but what about the database content,
>>> etc.).
>>>
>>> Robert
>>>
>>> On Mon, Feb 8, 2016 at 1:44 PM, Ned Rhudy (BLOOMBERG/ 731 LEX) <
>>> erhudy at bloomberg.net> wrote:
>>>
>>>> In our environments, we offer two types of storage. Tenants can either
>>>> use Ceph/RBD and trade speed/latency for reliability and protection against
>>>> physical disk failures, or they can launch instances that are realized as
>>>> LVs on an LVM VG that we create on top of a RAID 0 spanning all but the OS
>>>> disk on the hypervisor. This lets the users elect to go all-in on speed and
>>>> sacrifice reliability for applications where replication/HA is handled at
>>>> the app level, if the data on the instance is sourced from elsewhere, or if
>>>> they just don't care much about the data.
>>>>
>>>> There are some further changes to our approach that we would like to
>>>> make down the road, but in general our users seem to like the current
>>>> system and being able to forgo reliability or speed as their circumstances
>>>> demand.
>>>>
>>>> From: joe at topjian.net
>>>> Subject: Re: [Openstack-operators] RAID / stripe block storage volumes
>>>>
>>>> Hi Robert,
>>>>
>>>> Can you elaborate on "multiple underlying storage services"?
>>>>
>>>> The reason I asked the initial question is that historically we've made
>>>> our block storage service resilient to failure. We also made our compute
>>>> environment resilient to failure, but over time we've seen users become
>>>> better equipped to cope with compute failure. As a result, we've been
>>>> able to relax how much resilience we build into the compute environment.
>>>>
>>>> We've been discussing whether it would be possible to translate that
>>>> same idea to block storage. Rather than running a large HA storage
>>>> cluster (whether Ceph, Gluster, NetApp, etc.), is it possible to offer
>>>> simple standalone LVM volume servers and push failure handling onto the
>>>> user?
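>>>>
>>>> (To make that concrete, the user-facing workflow I have in mind looks
>>>> roughly like the sketch below. Exposing each standalone LVM box as its
>>>> own volume type, and the type and server names, are assumptions on my
>>>> part:)
>>>>
>>>>     # Hypothetical sketch: create one volume on each of two independent
>>>>     # LVM backends (assumed to be exposed as separate volume types) and
>>>>     # attach both to an instance so the guest can mirror them, e.g.
>>>>     # with mdadm --level=1.
>>>>     import subprocess
>>>>     import time
>>>>
>>>>     def openstack(*args):
>>>>         out = subprocess.check_output(["openstack"] + list(args))
>>>>         return out.decode()
>>>>
>>>>     def wait_available(volume):
>>>>         # Poll until the volume is 'available' before attaching it.
>>>>         while openstack("volume", "show", volume, "-f", "value",
>>>>                         "-c", "status").strip() != "available":
>>>>             time.sleep(5)
>>>>
>>>>     SERVER = "my-instance"                # assumed server name
>>>>     TYPES = ["lvm-node-a", "lvm-node-b"]  # assumed volume types
>>>>
>>>>     for i, vtype in enumerate(TYPES):
>>>>         name = "data-%d" % i
>>>>         openstack("volume", "create", "--size", "50",
>>>>                   "--type", vtype, name)
>>>>         wait_available(name)
>>>>         openstack("server", "add", "volume", SERVER, name)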
>>>>
>>>> Of course, this doesn't work for all types of use cases and
>>>> environments. We still have projects which require the cloud to own more
>>>> of the responsibility for failure than the users do.
>>>>
>>>> But for environments where we offer general-purpose / best-effort
>>>> compute and storage, what methods are available to help the user be
>>>> resilient to block storage failures?
>>>>
>>>> Joe
>>>>
>>>> On Mon, Feb 8, 2016 at 12:09 PM, Robert Starmer <robert at kumul.us>
>>>> wrote:
>>>>
>>>>> I've always recommended offering multiple underlying storage services
>>>>> to provide this, rather than adding the overhead inside the VM. So, no --
>>>>> not in any of my systems or any I've worked with.
>>>>>
>>>>> R
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Feb 5, 2016 at 5:56 PM, Joe Topjian <joe at topjian.net> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Does anyone have users RAID'ing or striping multiple block storage
>>>>>> volumes from within an instance?
>>>>>>
>>>>>> If so, what was the experience? Good, bad, possible but with caveats?
>>>>>>
>>>>>> Thanks,
>>>>>> Joe
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>> _______________________________________________
>> OpenStack-operators mailing list
>> OpenStack-operators at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>>
>