[Openstack-operators] Cinder block storage HA

Juan José Pavlik Salles jjpavlik at gmail.com
Tue Sep 16 19:23:56 UTC 2014


I'm not sure I follow your idea here. How would you do it with NFS?
Which machine would be the NFS server exporting the storage?

As you can see, I don't have much experience with cinder. We've been
using the LVM driver since we installed it a year and a half ago.
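
To put rough numbers on the timeout question from my first mail (quoted
below), here is a small Python sketch. It measures the outage in the
initiator log from that test and adds up the open-iscsi timers that I
understand to be involved. The helper names are made up for illustration,
and the defaults (5, 5 and 120 seconds for
node.conn[0].timeo.noop_out_interval, node.conn[0].timeo.noop_out_timeout
and node.session.timeo.replacement_timeout) are only what I believe a
stock iscsid.conf ships with, so check your own file:

```python
from datetime import datetime

# Initiator log lines from the test quoted below (wrapped lines rejoined).
LOG = """\
Sep 16 14:29:50 borrar-nfs kernel: [86630.418938]  connection1:0: detected conn error (1011)
Sep 16 14:29:53 borrar-nfs iscsid: connection1:0 is operational after recovery (1 attempts)
"""


def stamp(line, year=2014):
    """Parse the syslog timestamp at the start of a line.

    Syslog timestamps carry no year; 2014 is taken from the mail date.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def outage_seconds(log):
    """Seconds between the first connection error and the recovery message."""
    start = end = None
    for line in log.splitlines():
        if "detected conn error" in line and start is None:
            start = stamp(line)
        elif "is operational after recovery" in line:
            end = stamp(line)
    if start is None or end is None:
        raise ValueError("log has no complete error/recovery pair")
    return (end - start).total_seconds()


def failover_budget(noop_out_interval=5, noop_out_timeout=5,
                    replacement_timeout=120):
    """Rough timing budget in seconds (defaults are assumptions).

    Detection: a ping is sent every noop_out_interval seconds and is
    declared lost after noop_out_timeout seconds without a reply.
    Recovery window: after the error, I/O is queued for up to
    replacement_timeout seconds before it is failed to the upper layers.
    """
    return {
        "worst_case_detection": noop_out_interval + noop_out_timeout,
        "recovery_window": replacement_timeout,
    }


if __name__ == "__main__":
    print("measured outage (s):", outage_seconds(LOG))
    print("budget (s):", failover_budget())
```

In that test the session was back 3 seconds after the error was detected.
If I read the open-iscsi settings correctly, noop_out_timeout mostly
controls how quickly the failure is noticed, while replacement_timeout is
the time the virtual IP has to come up on the other node before I/O
errors reach the filesystem.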

2014-09-16 16:09 GMT-03:00 Abel Lopez <alopgeek at gmail.com>:

> Some of your concerns might be addressed by switching to NFS as the
> protocol.
> You’re already exporting large LUNs to your cinder-volume servers. With
> NFS, the exports would be writable by both nodes, so if one goes down
> there is no need to “swing LUNs over”.
>
> On Sep 16, 2014, at 11:27 AM, Juan José Pavlik Salles <jjpavlik at gmail.com>
> wrote:
>
> Hi Abel, I thought about trying it, but we had MANY performance problems
> with the EMC because of running too many LUNs, which is why we'd like to
> avoid that scenario. It might seem like the best solution, but we don't
> want to go that way again.
>
> 2014-09-16 15:20 GMT-03:00 Abel Lopez <alopgeek at gmail.com>:
>
>> Have you tried using the native EMC drivers? That way cinder only acts as
>> a broker between your instances and the storage back end, and you don't
>> need to worry (as much) about your cinder-volume service being HA.
>>
>>
>> On Tuesday, September 16, 2014, Juan José Pavlik Salles <
>> jjpavlik at gmail.com> wrote:
>>
>>> Hi guys, I'm trying to add some HA to our cinder service. We have the
>>> following scenario:
>>>
>>> -Real backends: EMC CLARiiON (SATA drives) and HP StoreVirtual P4000 (SAS
>>> drives). These two backends export 2 big LUNs to our (one and only right
>>> now) cinder server.
>>> -Once these big LUNs are imported on the cinder server, two different VGs
>>> are created for two different cinder LVM drivers (cinder-volumes-1 and
>>> cinder-volumes-2). This way I have two different storage resources to give
>>> to my tenants.
>>>
>>> What I want is to deploy a second cinder server to act as failover for
>>> the first one. Both servers are identical. So far I'm running a few tests
>>> with isolated VMs.
>>>
>>> -I installed corosync+pacemaker on 2 VMs and added a virtual IP.
>>> -Imported a LUN into the VMs over iSCSI and created a VG on it.
>>> -Exported an LV with tgt. More or less the same scenario we have in
>>> production.
>>>
>>> If one of the VMs dies, the second one picks up the virtual IP through
>>> which tgt is exporting the LUN, and the iSCSI session doesn't die. Here
>>> you can see part of the logs from the machine importing the LUN:
>>>
>>> Sep 16 14:29:50 borrar-nfs kernel: [86630.416160]  connection1:0: ping
>>> timeout of 5 secs expired, recv timeout 5, last rx 4316547395, last ping
>>> 4316548646, now 4316549900
>>> Sep 16 14:29:50 borrar-nfs kernel: [86630.418938]  connection1:0:
>>> detected conn error (1011)
>>> Sep 16 14:29:51 borrar-nfs iscsid: Kernel reported iSCSI connection 1:0
>>> error (1011) state (3)
>>> Sep 16 14:29:53 borrar-nfs iscsid: connection1:0 is operational after
>>> recovery (1 attempts)
>>>
>>> This test was really simple, just one 1GB LUN, but it worked OK, even
>>> when the failover was triggered during a write operation.
>>>
>>> So it seems to be a good solution so far, but there are a few things
>>> that worry me a bit:
>>>
>>> -Timeouts: how much time do I have to detect the problem and move the IP
>>> to the new node before the iSCSI connections die? I think I could play a
>>> little bit with timeo.noop_out_timeout in iscsid.conf.
>>> -What if a write operation was in progress when a node failed and it
>>> never reached the real backends? Could I come across inconsistencies in
>>> the volume FS? Any recommendations?
>>> -If I create a volume in cinder, the proper target file is created
>>> in /var/lib/cinder/volumes/volume-*, but I need the file to be created on
>>> both cinder nodes in case one of them fails. What would be a proper
>>> solution for this? Shared storage for the directory? SVN?
>>> -Should both servers be running tgt at the same time, or should I start
>>> tgt on the failover server only once the virtual IP has moved?
>>>
>>> Any comments or suggestions will be more than appreciated. Thanks!
>>>
>>> --
>>> Pavlik Salles Juan José
>>> Blog - http://viviendolared.blogspot.com
>>>

-- 
Pavlik Salles Juan José
Blog - http://viviendolared.blogspot.com