[Openstack-operators] Cinder performance and only point of failure
Tom Fifield
tom at openstack.org
Mon Jun 24 04:36:25 UTC 2013
Hi,
I'm not a cinder expert, but just to specifically address the problem of
having to use a single large LUN for /var/lib/nova/instances ...
For live migration, have you considered that you don't necessarily need
the capability to move VMs from every physical machine to every other
physical machine? Maybe it is enough to be able to move VMs around
within a 'pool' of N physical servers? If so, you may be able to
segregate your cloud into several pieces and have a separate
/var/lib/nova/instances LUN for each piece.
There are several ways to do this, which are introduced at a high level
here:
http://docs.openstack.org/trunk/openstack-ops/content/scaling.html#segregate_cloud
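For example, in Grizzly you could map a host aggregate to an
availability zone for each pool. A minimal sketch, with pool, zone and
host names purely illustrative:

    # one aggregate/availability zone per pool of compute nodes
    nova aggregate-create pool-a az-a
    nova aggregate-create pool-b az-b

    # add hosts using the aggregate ID printed by aggregate-create
    nova aggregate-add-host 1 compute01
    nova aggregate-add-host 1 compute02
    nova aggregate-add-host 2 compute03
    nova aggregate-add-host 2 compute04

    # boot a VM into a specific pool
    nova boot --availability-zone az-a --image <image> --flavor <flavor> test-vm

Each pool would then mount its own /var/lib/nova/instances LUN, and
live migration only ever happens within a pool.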
Also, as it seems like you are migrating from one environment to
another, you might be able to progressively add compute nodes to test
the scalability. If the massive LUN works at the scale you need, maybe
you can avoid the complexity ...
Regards,
Tom
On 22/06/13 05:50, Pavlik Salles Juan José wrote:
> Hi guys, I'd like to know your opinions about our situation and why we
> want to deploy Grizzly.
>
> Our production environment looks like this:
>
> -Everything is Linux.
> -2 storage servers, EMC (SATA disks) and HP (SAS disks), exporting
> many LUNs over iSCSI to the dom0s.
> -10 dom0s running between 15 and 25 VMs each with Xen.
> -Every dom0 mounts around eight 500 GB LUNs with ocfs2, which hold
> the small VMs.
> -Some special VMs have their own LUNs because of the space they need;
> when a VM needs more than 100 GB we create a dedicated LUN for it.
> -Having a clustered file system (ocfs2) lets us use Xen live migration
> for the small VMs.
> -We have around 200 VMs running, and it is getting a bit complicated
> to manage with the current infrastructure.
>
> Right now we are working on moving to Openstack (Grizzly) so this is
> what we have (this is a testing environment):
>
> -1 controller node with HAproxy for API balancing, MySQL, memcached,
> RabbitMQ and the quantum server. This controller won't be the only one
> in the final deployment (a rough HAproxy sketch follows this list).
> -1 compute node. Here we run most of the APIs and the nova services.
> -1 cinder server for block storage, using multi-backend. We've defined
> two backends, one for each storage server. This server has four 1 Gbit
> NICs: two form bond0, connected to the cloud network, and the other
> two form bond1, connected to the SAN.
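>
> Once there is more than one controller, the haproxy.cfg for the APIs
> would look roughly like this (addresses are placeholders; 8774 is the
> nova-api port, and each API would get a similar section):
>
>     listen nova-api
>         bind 192.168.1.10:8774
>         balance roundrobin
>         server controller1 192.168.1.11:8774 check
>         server controller2 192.168.1.12:8774 check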
>
> As you know, ephemeral disks are saved in /var/lib/nova/instances, so
> if we wanted to use live migration we would have to run ocfs2 to share
> this directory between the compute nodes (which is roughly what we
> have right now in production). BUT we can only mount one LUN on this
> directory, so we would have to create a huge one (more than 2 TB) to
> hold all the VMs, which may lead to performance problems.
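>
> The shared-directory setup would be roughly this on every compute
> node (the device path is a placeholder):
>
>     # /etc/fstab -- one big shared ocfs2 LUN for all ephemeral disks
>     /dev/mapper/instances-lun  /var/lib/nova/instances  ocfs2  _netdev,defaults  0 0
>
>     # nova.conf -- all nodes must use the same instances path
>     instances_path=/var/lib/nova/instances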
>
> We could also use cinder volumes for the VMs, maybe splitting things
> like this: a system disk (ephemeral) and an application-files disk
> (cinder volume). This way we'd have hundreds of iSCSI volumes managed
> by cinder (LVM+EMC and LVM+HP), roughly one for each VM. What do you
> think about this solution's performance?
>
> Compute node <-> iSCSI <-> Cinder server (LVM) <-> iSCSI <-> Storage Servers
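>
> Our multi-backend config is roughly the following (volume group and
> backend names are placeholders):
>
>     # cinder.conf (Grizzly multi-backend)
>     [DEFAULT]
>     enabled_backends=lvm-emc,lvm-hp
>
>     [lvm-emc]
>     volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver
>     volume_group=cinder-volumes-emc
>     volume_backend_name=LVM_EMC
>
>     [lvm-hp]
>     volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver
>     volume_group=cinder-volumes-hp
>     volume_backend_name=LVM_HP
>
>     # map a volume type to each backend, then create per-VM volumes
>     cinder type-create emc
>     cinder type-key emc set volume_backend_name=LVM_EMC
>     cinder create --volume-type emc --display-name vm01-app 100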
>
> Any comment/experience/suggestion/anything will be more than helpful! Thanks
>
> --
> Pavlik Salles Juan José
>