[openstack-dev] [trove] Adding support for HBase in Trove

Fox, Kevin M Kevin.Fox at pnnl.gov
Thu Jan 7 17:44:29 UTC 2016

The whole Hadoop-ish stack is unusual, though. I suspect users often want to slice and dice all the components that run together on the cluster, where HBase is just one component of the shared cluster. I can totally envision users walking up to my door saying, "I provisioned this HBase system with Trove, and now I want to run such-and-such job on the cluster..." Building on top of Sahara enables that kind of thing. If Trove wants to do the clustering all by itself, then that's either out of the picture, or you end up having to add lots of Sahara-like functionality to get back up to where users will want it.

From: michael mccune [msm at redhat.com]
Sent: Thursday, January 07, 2016 8:17 AM
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [trove] Adding support for HBase in Trove

thanks for bringing this up Amrith,

On 01/06/2016 07:31 PM, Fox, Kevin M wrote:
> Having a simple plugin that doesn't depend on all of Sahara, for the case where a user only wants a single-node HBase, does make sense. It's much easier for an Op to support that case if that's all their users ever want. But that's probably as far as that plugin should ever go. If you need scale up/down, etc., then you're starting to reimplement large swaths of Sahara, and, like the Cinder plugin for Nova, there could be a plugin that works identically to the standalone one that converts the same API over to a Sahara-compatible one. You then farm the work over to Sahara.

i think this sounds reasonable, as long as we are limiting it to
standalone mode. if the deployments start to take on a larger scope i
agree it would be useful to leverage sahara for provisioning and scaling.
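
to make the split concrete, here is a rough sketch of the idea from the
quoted message: one provisioning interface, a standalone implementation
limited to a single node, and a second implementation that hands the
same request to sahara. every name here (HBaseProvisioner,
FakeSaharaClient, pick_provisioner, and so on) is hypothetical and made
up for illustration; none of it is real trove or sahara api, and the
sahara client is a stand-in object.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ProvisionRequest:
    """Hypothetical request shape: a name and a node count."""
    name: str
    node_count: int = 1


class HBaseProvisioner(ABC):
    """Hypothetical common interface shared by both plugins."""

    @abstractmethod
    def create(self, request):
        """Provision HBase and return a small description dict."""


class StandaloneProvisioner(HBaseProvisioner):
    """Single-node HBase only; deliberately refuses anything larger."""

    def create(self, request):
        if request.node_count != 1:
            raise ValueError("standalone mode supports exactly one node")
        return {"name": request.name, "backend": "standalone", "nodes": 1}


class FakeSaharaClient:
    """Stand-in for a Sahara client (assumption, not the real client API)."""

    def __init__(self):
        self.requests = []

    def create_cluster(self, name, node_count):
        self.requests.append((name, node_count))
        return {"name": name, "nodes": node_count}


class SaharaBackedProvisioner(HBaseProvisioner):
    """Same interface, but the clustering work is farmed out to Sahara."""

    def __init__(self, client):
        self._client = client

    def create(self, request):
        cluster = self._client.create_cluster(request.name, request.node_count)
        return {"name": cluster["name"], "backend": "sahara",
                "nodes": cluster["nodes"]}


def pick_provisioner(request, sahara_client=None):
    """Use the standalone plugin for one node, the Sahara-backed one otherwise."""
    if request.node_count == 1:
        return StandaloneProvisioner()
    if sahara_client is None:
        raise RuntimeError("clustered HBase requires the Sahara-backed plugin")
    return SaharaBackedProvisioner(sahara_client)
```

the caller only ever sees create(), so an operator whose users want
nothing beyond single-node hbase would never need sahara installed,
while larger requests go through the sahara-backed path.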

as the hbase installation grows beyond the standalone mode there will
necessarily need to be hdfs and zookeeper support to allow for a proper
production deployment. this also brings up questions of allowing the
end-users to supply configurations for the hdfs and zookeeper processes,
not to mention enabling support for high availability hdfs.

i can envision a scenario where trove could use sahara to provision and
manage the clusters for hbase/hdfs/zk. this does pose some questions as
we'd have to determine how the trove guest agent would be installed on
the nodes, if there will need to be custom configurations used by trove,
and if sahara will need to provide a plugin for bare (meaning no data
processing framework) hbase/hdfs/zk clusters. but, i think these could
be solved by either using custom images or a plugin in sahara that would
install the necessary agents/configurations.

of course, this does add a layer of complexity as operators who wish
this type of deployment will need to have both trove and sahara, but imo
this would be easier than replicating the work that sahara has done with
these technologies.

