[Openstack-operators] Cinder performance and single point of failure

Juan José Pavlik Salles jjpavlik at gmail.com
Mon Jun 24 16:57:25 UTC 2013


I've been thinking about live migration, and I'm pretty sure we don't
strictly need it (right now, with Xen, live migration only works between
some of the nodes... so it isn't used that much), but it is something
we'd like to have. Live migration is really useful in situations where
you have to do maintenance on a server and you don't want to stop the
services running on its VMs. I had never thought about segregation for
this, but now that you mention it, it looks like a good choice.
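
Just so I'm sure I follow the idea, a rough sketch of what I'd try
(completely untested, and the pool/host/aggregate names below are made up):

  # Group the compute nodes into migration pools, each exposed as its own
  # availability zone (aggregate IDs/names may need adjusting for Grizzly).
  nova aggregate-create pool-a az-pool-a
  nova aggregate-add-host pool-a compute-01
  nova aggregate-add-host pool-a compute-02

  nova aggregate-create pool-b az-pool-b
  nova aggregate-add-host pool-b compute-03
  nova aggregate-add-host pool-b compute-04

  # A live migration would then only ever target a host in the same pool,
  # since only those hosts share that pool's instances LUN, e.g.:
  nova live-migration <instance-id> compute-02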

About the migration, we are definitely doing it sloooooowwwwlyyyy,
because we are real newbies with OpenStack, so one step at a time.
We even plan to run both clouds in parallel until we are pretty sure
the new one can handle everything.

A couple of years ago we had problems with the EMC storage running
too many LUNs (one LUN per drive, so something around 400 LUNs, with
multipath on the compute nodes... 4 paths for each drive... pretty
messy), so we gradually changed to this new scheme with fewer but
bigger LUNs. Someone from EMC said we should do that and recommended
a 500 GB LUN size. I have to say that the change really improved our
situation, so I wouldn't like to go back.

Thanks!

2013/6/24 Tom Fifield <tom at openstack.org>:
> Hi,
>
> I'm not a cinder expert, but just to specifically address the problem of
> having to use a single large LUN for /var/lib/nova/instances ...
>
> For live migration, have you considered that you don't necessarily need the
> capability to move VMs from every physical machine to every other physical
> machine? Maybe it is enough to just be able to move VMs around within a
> 'pool' of N physical servers? If this is true, you may be able to segregate
> your cloud into several pieces and have separate /var/lib/nova/instances
> LUNs for each segregated piece.
>
> There are several ways to do this, which are introduced at a high level
> here:
> http://docs.openstack.org/trunk/openstack-ops/content/scaling.html#segregate_cloud
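
If I understand it right, every compute node in a pool would mount that
pool's own (smaller) instances LUN, something like this in /etc/fstab
(the multipath device names are just placeholders, and each pool's nodes
would share an o2cb cluster configuration):

  # pool A compute nodes
  /dev/mapper/pool-a-instances  /var/lib/nova/instances  ocfs2  _netdev,defaults  0 0

  # pool B compute nodes mount a different LUN at the same path
  /dev/mapper/pool-b-instances  /var/lib/nova/instances  ocfs2  _netdev,defaults  0 0
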
>
> Also, as it seems like you are migrating from one environment to another,
> you might be able to progressively add compute nodes to test the
> scalability. If the massive LUN works at the scale you need, maybe you can
> avoid the complexity ...
>
>
> Regards,
>
>
> Tom
>
>
>
> On 22/06/13 05:50, Juan José Pavlik Salles wrote:
>>
>> Hi guys, I'd like to hear your opinions about our situation and how we
>> want to deploy Grizzly.
>>
>> Our production environment looks like this:
>>
>> -Everything is Linux.
>> -2 storage servers, one EMC (SATA disks) and one HP (SAS disks),
>> exporting many LUNs over iSCSI to the dom0s.
>> -10 dom0s running between 15 and 25 VMs each with Xen.
>> -Every dom0 mounts around eight 500 GB LUNs with OCFS2, where the small
>> VMs live.
>> -Some special VMs have their own LUNs because of the space they need;
>> when a VM needs more than 100 GB we create a dedicated LUN for it.
>> -Having a clustered file system (OCFS2) lets us use Xen live migration
>> for the small VMs.
>> -We have around 200 VMs running, and it is getting a bit complicated to
>> manage with the current infrastructure.
>>
>> Right now we are working on moving to OpenStack (Grizzly), so this is
>> what we have (a testing environment):
>>
>> -1 controller node with HAProxy for API balancing, MySQL, memcached,
>> RabbitMQ and the Quantum server. This won't be the only controller in
>> the final deployment.
>> -1 compute node. Here we run most of the APIs and the Nova services.
>> -1 Cinder server for block storage, using multi-backend. We've defined
>> two backends, one for each storage server. This server has four 1 Gbit
>> NICs: two of them form bond0, connected to the cloud network, and the
>> other two form bond1, connected to the SAN.
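
Something like this is the cinder.conf pattern I mean for the two backends
(just a sketch; backend names and volume groups are placeholders, and the
LVM iSCSI driver path should be double-checked against the Grizzly docs):

  [DEFAULT]
  enabled_backends=lvm-emc,lvm-hp

  [lvm-emc]
  volume_group=cinder-emc
  volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver
  volume_backend_name=LVM_EMC

  [lvm-hp]
  volume_group=cinder-hp
  volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver
  volume_backend_name=LVM_HP

plus the matching volume types so a volume can be pointed at the SATA (EMC)
or SAS (HP) storage:

  cinder type-create sata
  cinder type-key sata set volume_backend_name=LVM_EMC
  cinder type-create sas
  cinder type-key sas set volume_backend_name=LVM_HP
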
>>
>> As you know, ephemeral disks are stored in /var/lib/nova/instances, so
>> if we wanted to use live migration we would have to use OCFS2 to share
>> this directory between the compute nodes (which is roughly what we have
>> right now in production). BUT we can only mount one LUN on this
>> directory, so we would have to create a huge one (more than 2 TB) to
>> hold all the VMs, which may lead to performance problems.
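
(Rough numbers: ~200 VMs on a 2+ TB LUN works out to something like 10 GB
per VM on average, so splitting the compute nodes into, say, four pools of
~50 VMs each would bring every instances LUN back down to roughly 500 GB,
which is about the size EMC recommended to us anyway.)
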
>>
>> We could also use Cinder volumes for the VMs, maybe splitting things
>> up: a system disk (ephemeral) and an application data disk (Cinder
>> volume). This way we'd have hundreds of iSCSI volumes managed by Cinder
>> (LVM+EMC and LVM+HP), roughly one for each VM. What do you think about
>> this solution's performance? The data path would be:
>>
>> Compute node <-> iSCSI <-> Cinder server (LVM) <-> iSCSI <-> Storage
>> Servers
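
The per-VM workflow I picture is something like this (flavor, image, names
and sizes below are made up, and the attach device name depends on the
hypervisor):

  # boot the VM with a small ephemeral root disk for the system files
  nova boot --flavor m1.small --image precise-server vm-app-01

  # create a 50 GB data volume on the SATA (EMC) backend for the
  # application files and attach it to the VM
  cinder create --volume-type sata --display-name vm-app-01-data 50
  nova volume-attach vm-app-01 <volume-id> /dev/vdb
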
>>
>> Any comment/experience/suggestion/anything will be more than helpful!
>> Thanks
>>
>> --
>> Pavlik Salles Juan José
>>
>
>



-- 
Pavlik Salles Juan José


