Re: [openstack-hpc] Questions about porting StarCluster to OpenStack

19 Apr 2014

We need a more advanced multi-tier scheduling capability to 
properly place multiple streaming tiers on heterogeneous hardware. The 
top level must be scheduled like Storm or S4, as a coarse-grained data 
flow, but individual operators must be scheduled on specialized 
hardware pools. We are looking at integrating this into the Nova 
scheduler, so that it is available as a standard capability of an 
OpenStack cloud. 
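
As a rough illustration of the second tier only, here is a minimal
Python sketch of capability-based placement. It is not Nova code and
does not use Nova's API; the pool layout, the operator names, and the
place_operators helper are all hypothetical. Each operator declares the
capabilities it needs and is placed on the first pool that offers all
of them and still has a free slot.

```python
def place_operators(operators, pools):
    """Map each operator to a pool name, or None if nothing fits.

    operators: dict of operator name -> set of required capabilities
    pools:     dict of pool name -> {"capabilities": set, "slots": int}
    Both structures are hypothetical and exist only for this sketch.
    """
    free = {name: pool["slots"] for name, pool in pools.items()}
    placement = {}
    for op_name, required in operators.items():
        for pool_name, pool in pools.items():
            # A pool qualifies if it offers every required capability
            # and has at least one slot left.
            if required <= pool["capabilities"] and free[pool_name] > 0:
                placement[op_name] = pool_name
                free[pool_name] -= 1
                break
        else:
            placement[op_name] = None  # no suitable pool found
    return placement
```

A real scheduler would also rank the qualifying pools rather than take
the first match; this sketch covers only the filtering step.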

So, given the context of this StarCluster project, what would be the 
pros and cons of forward porting an old cluster scheduler instead of 
incorporating this into the Nova scheduler itself?

On Fri, 18 Apr 2014 16:38:35 -0400, Jonathan Proulx <jon@jonproulx.com> wrote:
Hi All,
...
For those who don't know me I deployed and run the OpenStack cloud at
MIT CSAIL (http://www.csail.mit.edu)
I'm currently working with Justin Riley (address in the "To" header of
this email)
who's the primary developer for StarCluster
[http://star.mit.edu/cluster/index.html] on porting StarCluster to
OpenStack.  We have some operational and use case questions before we
get too far down the implementation rathole^H^H^H^H^H^H^H path.
I've sent this to openstack-hpc and some select bcc's to get a sense
of who might be interested and what opinions you have about how the
port should be implemented.
What is it?
If you're not familiar, StarCluster is an open source
cluster-computing toolkit for Amazon’s Elastic Compute Cloud (EC2)
released under the LGPL license. It has been designed to automate and
simplify the process of building, configuring, and managing clusters
of virtual machines on Amazon’s EC2 cloud. StarCluster allows anyone
to easily create a cluster computing environment in the cloud suited
for distributed and parallel computing applications and systems.
Its main target audience is domain scientists who want to set up
SGE, Condor, Hadoop, or a few other sorts of cluster in "the cloud".
Its current implementation is basically a config-file-driven CLI that
uses locally stored ssh keys for managing the running cluster.  It
has a fair-sized user community and people seem to like it, which is
why one of my users introduced Justin and me, so I could provide a place
for Justin to work and my user could get StarCluster on our private
cloud.
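
To make "config file driven" concrete, here is a minimal Python sketch
that reads an INI-style cluster definition and resolves which local ssh
key it points at. The section and option names are modelled on the
StarCluster config format but should be treated as illustrative
assumptions, and load_cluster is a hypothetical helper, not part of
StarCluster.

```python
import configparser
import os

# Illustrative config text; names and values are assumptions.
SAMPLE_CONFIG = """
[global]
default_template = smallcluster

[key mykey]
key_location = ~/.ssh/mykey.rsa

[cluster smallcluster]
keyname = mykey
cluster_size = 4
cluster_user = sgeadmin
"""


def load_cluster(config_text, template=None):
    """Return the settings for one cluster template as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Fall back to the default template named in [global].
    name = template or parser.get("global", "default_template")
    cluster = dict(parser["cluster " + name])
    # Follow the keyname reference to the matching [key ...] section.
    key_section = "key " + cluster["keyname"]
    cluster["key_location"] = os.path.expanduser(
        parser.get(key_section, "key_location")
    )
    cluster["cluster_size"] = int(cluster["cluster_size"])
    return cluster
```

Calling load_cluster(SAMPLE_CONFIG) yields the cluster size, the
cluster user, and the expanded path of the private key the CLI would
use.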
Where is it at?
The CLI "basically works" on OpenStack now, though so far no end-users
have touched it, so it is not quite ready for public beta.  If you're really
interested in early code I'm sure Justin will be happy to share if you
ask nicely.
He has also been working on a Horizon dashboard which would be an
additional feature for the OpenStack version (there is no EC2 GUI).
The BIG QUESTIONS?
Do you think this is something that would be useful to your users?
To enable full functionality in the Horizon dashboard the dashboard
app needs access to a private key to access the running (virtual)
cluster nodes as root.
Waving hands over implementation details, and assuming you're
interested in this functionality if you're this far into the email:
is storing key material a show stopper in your environment?  In other
words, would you rather just fall back to the CLI with its local
~/.ssh/id_rsa (or equivalent) for privileged operations?
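
Whichever side ends up holding the key, one cheap safeguard is to
refuse a private key file that other accounts can read, much as the
OpenSSH client does. The Python sketch below shows that check on a
POSIX system; key_file_is_private is a hypothetical helper, not
something StarCluster or Horizon is known to provide.

```python
import os
import stat


def key_file_is_private(path):
    """Return True if path is a regular file with no group/other access."""
    mode = os.stat(path).st_mode
    # Any group or other permission bit disqualifies the file.
    exposed = mode & (stat.S_IRWXG | stat.S_IRWXO)
    return stat.S_ISREG(mode) and exposed == 0
```

For example, key_file_is_private(os.path.expanduser("~/.ssh/id_rsa"))
could be checked before the key is used for any privileged operation.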
My Opinion...
I don't like centrally storing crypto keys, but my Horizon runs on my
controller node so it is already a fairly privileged zone, and I don't
necessarily trust my users to store their key material even as well as
this could be done.  So while a bit cautious about the details I think
it is an acceptable risk in my (admittedly permissive) environment.
Thoughts?
-Jon
_______________________________________________
OpenStack-HPC mailing list
OpenStack-HPC@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-hpc

theo＠stillwater-sc.com