[Openstack-operators] Avoiding storage redundancy with Openstack redundant storage and HDFS 3x replication
Edmon Begoli
ebegoli at gmail.com
Sat Nov 12 03:30:00 UTC 2011
A question related to standing up cloud infrastructure for running Hadoop/HDFS.
We are building an infrastructure on OpenStack, which has its own
storage redundancy.
We plan to use OpenStack to instantiate Hadoop nodes (HDFS,
M/R tasks, Hive, HBase)
on demand.
The problem is that HDFS by default keeps three copies of each block,
so on top of OpenStack's own storage redundancy we end up with about
4x redundancy, which we would prefer to avoid.
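To make the overhead concrete, here is a rough sketch in Python. It assumes
the two layers replicate independently, so their factors multiply. The
function names and the backend copy counts are hypothetical, and the real
figure depends on how the OpenStack storage is configured.

```python
def physical_copies(hdfs_replication: int, backend_copies: int) -> int:
    """Copies of each HDFS block that end up on disk.

    Assumption: the backend stores every byte it is given `backend_copies`
    times, independently of HDFS, so the two factors multiply.
    """
    if hdfs_replication < 1 or backend_copies < 1:
        raise ValueError("replication factors must be at least 1")
    return hdfs_replication * backend_copies


def raw_storage_tb(dataset_tb: float, hdfs_replication: int, backend_copies: int) -> float:
    """Raw capacity (TB) needed to hold `dataset_tb` of logical data."""
    return dataset_tb * physical_copies(hdfs_replication, backend_copies)


if __name__ == "__main__":
    # HDFS replication of 3 against hypothetical backend copy counts.
    for backend in (1, 2, 3):
        copies = physical_copies(3, backend)
        print(f"backend copies={backend}: {copies} copies per block, "
              f"{raw_storage_tb(10.0, 3, backend):.0f} TB raw for 10 TB of data")
```

Under this model, lowering either factor reduces the raw capacity needed
proportionally.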
Has anyone here dealt with a similar case, and is there a solution
you would recommend?
Thank you in advance,
Edmon