[Openstack-operators] Avoiding storage redundancy with Openstack redundant storage and HDFS 3xreplication

andi abes andi.abes at gmail.com
Mon Nov 14 20:02:08 UTC 2011


Replication in HDFS is not only for failure recovery; it is mostly for
performance. Keeping the data close to the computation means that, ideally,
MapReduce steps read only from local storage.

If you're trying to run this on top of OpenStack, you might want to create
the volumes that HDFS runs on without replication at the OpenStack layer,
and keep them separate from the more redundant volumes used for the VMs'
operating systems.
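
To make the storage arithmetic concrete, here is a minimal sketch of how the
two replication layers multiply. The function names and the backend copy
counts are assumptions made for illustration; the thread does not say how
many copies the OpenStack storage layer actually keeps.

```python
def raw_copies(hdfs_replication: int, backend_copies: int) -> int:
    """Physical copies of each HDFS block when HDFS runs on volumes
    that the storage layer underneath replicates as well."""
    if hdfs_replication < 1 or backend_copies < 1:
        raise ValueError("replication factors must be >= 1")
    # Each HDFS replica is stored on a volume that is itself kept
    # backend_copies times, so the two factors multiply.
    return hdfs_replication * backend_copies


def raw_capacity_tb(dataset_tb: float, hdfs_replication: int,
                    backend_copies: int) -> float:
    """Raw disk capacity (TB) needed to hold dataset_tb of user data."""
    return dataset_tb * raw_copies(hdfs_replication, backend_copies)


if __name__ == "__main__":
    # Hypothetical combinations: (HDFS replication, backend copies).
    for hdfs_r, backend in [(3, 1), (3, 2), (1, 3)]:
        print(f"HDFS x{hdfs_r} on backend x{backend}: "
              f"{raw_copies(hdfs_r, backend)} copies, "
              f"{raw_capacity_tb(10, hdfs_r, backend)} TB raw for 10 TB of data")
```

With non-replicated volumes under HDFS (backend copies = 1), the overhead
stays at the HDFS factor alone and data locality is preserved. The HDFS side
of the product is the `dfs.replication` property in hdfs-site.xml.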




On Mon, Nov 14, 2011 at 8:42 AM, Galloway, Michael D.
<gallowaymd at ornl.gov> wrote:

> I'm not sure this is a real answer, but you can set the replication factor
> in HDFS to whatever you want: 1, 2, 3, etc. I'm not sure a replication
> factor of 1 makes sense for your application, but it is configurable.
>
> --- michael
>
> -----Original Message-----
> From: openstack-operators-bounces at lists.openstack.org [mailto:
> openstack-operators-bounces at lists.openstack.org] On Behalf Of Edmon Begoli
> Sent: Friday, November 11, 2011 10:30 PM
> To: openstack-operators at lists.openstack.org
> Subject: [Openstack-operators] Avoiding storage redundancy with Openstack
> redundant storage and HDFS 3xreplication
>
> A question related to standing up cloud infrastructure for running
> Hadoop/HDFS.
>
> We are building up an infrastructure using Openstack which has its own
> storage management redundancy.
>
> We are planning to use Openstack to instantiate Hadoop nodes (HDFS,
> M/R tasks, Hive, HBase)
> on demand.
>
> The problem is that HDFS by design creates three copies of the data,
> so there is a 4x redundancy
> which we would prefer to avoid.
>
> I am asking here if anyone has had a similar case and if anyone has
> had any helpful solution to recommend.
>
> Thank you in advance,
> Edmon
> _______________________________________________
> Openstack-operators mailing list
> Openstack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>