[Openstack] Is Openstack suitable to my problem?

Thiago Moraes thiago.camposmoraes at gmail.com
Mon Aug 15 04:07:30 UTC 2011


I took a look at some distributed file systems and went a little deeper into
Hadoop and its HDFS, for instance. I don't really need full POSIX
compliance, but having a nested structure is important. As far as I know,
there are ways to simulate this on Swift; is that correct?
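
To make concrete what I mean by simulating a nested structure on a flat
object store: objects are named with slashes, and a listing is filtered by
prefix and grouped at a delimiter. The sketch below is plain Python over an
in-memory list of names, not the Swift API; the function name list_level and
the object names are made up for illustration.

```python
def list_level(names, prefix="", delimiter="/"):
    """Return (subdirs, files) found directly under `prefix` in a flat name list."""
    subdirs, files = set(), []
    for name in sorted(names):
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter in rest:
            # Everything past the next delimiter collapses into one pseudo-directory.
            subdirs.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        elif rest:
            files.append(name)
    return sorted(subdirs), files


# Hypothetical object names stored in a single flat container.
objects = [
    "projectA/run1/sample.dat",
    "projectA/run2/sample.dat",
    "projectA/README",
    "projectB/run1/sample.dat",
]

top_dirs, top_files = list_level(objects)
a_dirs, a_files = list_level(objects, prefix="projectA/")
```

Here top_dirs holds the two project "directories" and a_files holds only the
README, even though the store itself has no real hierarchy.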

The problem I see in using something like Hadoop is the single point of
failure, not because I need almost 100% availability, but because the people
who will access the data do not belong to the same organization. They will
be researchers from different institutions who may want to deploy a local
server with a subset of the data to improve their productivity, but the data
set's size makes it impractical to just copy everything.

The plan would be that the interface to the system would show which files
are stored locally and which are not, so that everyone gets access to
everything, almost like a peer to peer system where they download from the
closest source and then store for their own use.
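
A minimal sketch of that bookkeeping, assuming each site simply knows which
file names it holds and that some distance measure between sites (latency,
for example) is available. The Site class, the catalog and fetch functions,
the site names and the distances are all hypothetical, and the actual
transfer is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One deployment holding a subset of the files (hypothetical model)."""
    name: str
    files: set = field(default_factory=set)


def catalog(all_files, local):
    """Map every file name to 'local' or 'remote' as seen from `local`."""
    return {f: ("local" if f in local.files else "remote") for f in sorted(all_files)}


def fetch(filename, local, peers, distance):
    """Return the site that serves `filename`, recording a local copy afterwards.

    `distance` maps a peer name to an assumed cost; only the bookkeeping is
    shown here, not the download itself.
    """
    if filename in local.files:
        return local
    holders = [p for p in peers if filename in p.files]
    if not holders:
        raise FileNotFoundError(filename)
    source = min(holders, key=lambda p: distance[p.name])
    local.files.add(filename)  # keep a copy for later local access
    return source


# Hypothetical sites: a central server, a nearby lab, and my own server.
central = Site("central", {"a.dat", "b.dat", "c.dat"})
lab = Site("lab", {"b.dat"})
mine = Site("mine", {"a.dat"})

view = catalog(central.files, mine)
served_by = fetch("b.dat", mine, [central, lab], {"central": 80, "lab": 5})
```

In this example the view marks a.dat as local and the rest as remote, and
b.dat is served by the nearby lab rather than the central server.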

At first, I thought of implementing something by hand, but using an already
mature solution makes a lot more sense.

So, is this plausible or am I trying to use the wrong tools?

thanks, again

Thiago Moraes - EnC 07 - UFSCar


2011/8/14 Todd Deshane <todd.deshane at xen.org>

> On Sun, Aug 14, 2011 at 4:10 AM, Thiago Moraes
> <thiago.camposmoraes at gmail.com> wrote:
> > Hey guys,
> >
> > I'm new on the list and I'm currently considering Openstack to solve a data
> > distribution problem. Right now, there's a server which contains very large
> > files (usual files have 30GB or even more). This server is accessed by LAN
> > and over the internet but, of course, it's difficult to do this without
> > local connection.
> >
> > My idea to solve this problem is to deploy new servers in the places which
> > access data more often, in such a way that they get a local copy of the
> > most accessed part of the data. In my head, I consider that there will
> > be N different clouds, one at my location and the others spread over other
> > networks. Then, these new clouds would download and store parts of the data
> > (entire files) so that they can be accessed through their own LAN.
> >
>
> It sounds like you are looking for the functionality that Zones (aim
> to?) provide.
>
> Take a look at:
>
> http://wiki.openstack.org/MultiClusterZones
>
>
> > Is Openstack suitable in this environment? Would anyone recommend another
> > solution?
> >
>
> Have you also looked at SheepDog, Hadoop or HC2? All of these seem to
> have some OpenStack integration points as well.
>
> Some links to look into:
> http://wiki.openstack.org/SheepdogSupport
> http://doubleclix.wordpress.com/2011/03/17/hadoop-2-0-openstack-pbj/
>
> http://www.quora.com/What-features-differentiate-HDFS-and-OpenStack-Object-Storage
>
>
> Hope that helps.
>
> Thanks,
> Todd
>
> > PS: I know the file size limitations of 5GB. I just need all parts of a
> > file to be in the same local area network so that a blazingly fast Internet
> > connection is not required all the time.
> >
> > thanks,
> >
> >
> > Thiago Moraes - EnC 07 - UFSCar
> >
>
>
>
> --
> Todd Deshane
> http://www.linkedin.com/in/deshantm
> http://www.xen.org/products/cloudxen.html
> http://runningxen.com/
>