[Openstack-operators] Cinder performance and single point of failure

Tom Fifield tom at openstack.org
Mon Jun 24 04:36:25 UTC 2013


Hi,

I'm not a cinder expert, but just to specifically address the problem of 
having to use a single large LUN for /var/lib/nova/instances ...

For live migration, have you considered that you don't necessarily need 
the capability to move VMs from every physical machine to every other 
physical machine? Maybe it is enough to just be able to move VMs around 
within a 'pool' of N physical servers? If this is true, you may be able 
to segregate your cloud into several pieces and have a separate 
/var/lib/nova/instances LUN for each piece.

There are several ways to do this, which are introduced at a high level 
here: 
http://docs.openstack.org/trunk/openstack-ops/content/scaling.html#segregate_cloud
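
As a rough sketch of one way to express such pools (all names, IDs and
the pool layout below are made up for illustration), host aggregates /
availability zones let you group the compute nodes that share a given
instances LUN, and then boot VMs into a specific pool:

   # Hypothetical: two pools, each exposed as its own availability zone
   # and backed by its own /var/lib/nova/instances LUN
   nova aggregate-create pool-a az-pool-a
   nova aggregate-add-host 1 compute-01   # 1 = id returned by aggregate-create
   nova aggregate-add-host 1 compute-02

   nova aggregate-create pool-b az-pool-b
   nova aggregate-add-host 2 compute-03
   nova aggregate-add-host 2 compute-04

   # VMs booted into az-pool-a can then be live migrated between the
   # hosts that share pool-a's LUN
   nova boot --availability-zone az-pool-a --image <image> --flavor <flavor> test-vm

Live migration still only works between hosts that mount the same
/var/lib/nova/instances LUN, so the pool boundaries should match the
storage layout.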

Also, since it seems you are migrating from one environment to 
another, you might be able to add compute nodes progressively to test 
the scalability. If the massive LUN works at the scale you need, maybe 
you can avoid the extra complexity ...


Regards,


Tom


On 22/06/13 05:50, Pavlik Salles Juan José wrote:
> Hi guys, I'd like to know your opinions about our situation and why we
> want to deploy Grizzly.
>
> Our production environment looks like this:
>
> -Everything is Linux.
> -2 storage servers, an EMC (SATA disks) and an HP (SAS disks), exporting
> many LUNs over iSCSI to the dom0s.
> -10 dom0s, each running between 15 and 25 VMs on Xen.
> -Every dom0 mounts around eight 500 GB LUNs with OCFS2, where the small VMs live.
> -Some special VMs have their own LUNs because of the space they need:
> when a VM needs more than 100 GB we create a dedicated LUN for it.
> -Having a clustered file system (OCFS2) lets us use Xen live migration
> for the small VMs.
> -We have around 200 VMs running, and it is getting a bit complicated to
> manage with the current infrastructure.
>
> Right now we are working on moving to OpenStack (Grizzly), so this is
> what we have (a testing environment):
>
> -1 controller node with HAProxy for API load balancing, MySQL, memcached,
> RabbitMQ and the Quantum server. This won't be the only controller in
> the final deployment.
> -1 compute node. Here we run most of the APIs and the nova services.
> -1 Cinder server for block storage, using multi-backend. We've defined
> two backends, one for each storage server. This server has 4 x 1 Gbit
> NICs: two of them form bond0, connected to the cloud network, and the
> other two form bond1, connected to the SAN.
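
(For reference, a Grizzly multi-backend cinder.conf for a layout like
this might look roughly like the sketch below; the backend names and
volume group names are only illustrative.)

   enabled_backends=lvm-emc,lvm-hp

   [lvm-emc]
   volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver
   volume_group=cinder-volumes-emc
   volume_backend_name=LVM_EMC

   [lvm-hp]
   volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver
   volume_group=cinder-volumes-hp
   volume_backend_name=LVM_HP

   # Volume types then steer a volume to a specific backend:
   #   cinder type-create emc
   #   cinder type-key emc set volume_backend_name=LVM_EMC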
>
> As you know, ephemeral disks are saved in /var/lib/nova/instances, so
> if we wanted to use live migration we would have to run OCFS2 to share
> this directory between the compute nodes (which is roughly what we have
> right now in production). BUT we can only mount one LUN on this
> directory, so we would have to create a huge one (more than 2 TB) to
> hold all the VMs, which may lead to performance problems.
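
(Sharing the directory that way usually comes down to a single OCFS2
mount on every compute node in the pool, something like the line below;
the device path is only a placeholder, and the o2cb cluster stack has
to be configured on each node first.)

   # /etc/fstab on each compute node in the pool
   /dev/mapper/instances-lun  /var/lib/nova/instances  ocfs2  _netdev,defaults  0  0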
>
> We could also use Cinder volumes for the VMs, maybe splitting things
> like: system disk (ephemeral) and application data disk (Cinder
> volume). This way we'd have hundreds of iSCSI volumes managed by
> Cinder (LVM+EMC and LVM+HP), around one for each VM. What do you think
> about this solution's performance?
>
> Compute node <-> iSCSI <-> Cinder server (LVM) <-> iSCSI <-> Storage Servers
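
(If you go the volume-per-VM route, the per-VM workflow would be roughly
the following; the volume type, size and device name are only examples.)

   # Create a 100 GB application-data volume on the EMC backend and attach it
   cinder create --volume-type emc --display-name vm01-appdata 100
   nova volume-attach <instance-id> <volume-id> /dev/vdb

Note that in the path shown above every block I/O crosses iSCSI twice
and funnels through the single Cinder/LVM server, which is where both
the performance and the single-point-of-failure concerns concentrate.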
>
> Any comment/experience/suggestion/anything will be more than helpful! Thanks
>
> --
> Pavlik Salles Juan José
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>



