[openstack-dev] [Savanna] Spark plugin status
Daniele.Venzano at eurecom.fr
Thu Jan 9 10:47:58 UTC 2014
On 01/09/14 09:41, Sergey Lukjanov wrote:
> On Thu, Jan 9, 2014 at 11:33 AM, Daniele Venzano
> <Daniele.Venzano at eurecom.fr> wrote:
> we are finishing up the development of the Spark plugin for Savanna.
> In the next few days we will deploy it on an OpenStack cluster with
> real users to iron out the last few things. Hopefully next week we
> will put the code on a public github repository in beta status.
> [SL] Awesome! Could you please share some info about this installation,
> if possible? For example OpenStack cluster version and size, Savanna
> version, expected Spark cluster sizes and lifecycle, etc.
As part of the Bigfoot project that is funding us
(http://bigfootproject.eu/) we have a research OpenStack cluster with 6
compute nodes, hopefully with more coming. Each machine has 16 CPUs (32
with hyperthreading) and 128GB of RAM.
OpenStack is the Ubuntu cloud version (Grizzly 2013.1.4), but Horizon
and Keystone are on the latest Havana branch versions. It uses KVM and
the openvswitch plugin for networking.
For Savanna, we stayed with a version from git that was working for us.
It is newer than 0.3, but now a couple of months old. Part of the work I
need to do is merging with the current Savanna master branch.
We have five users that are interested in running Spark jobs and at
least one has already been doing so on the Bigfoot platform with a
cluster created by hand.
We will start with two of them and then let in the others. One will use
a small cluster with 3 nodes, the other a cluster with about ten nodes.
We also plan to run a few tests with various sizes of clusters, mainly
to measure performance in various conditions.
> [SL] You can use diskimage-builder to prepare such images, we're
> already using it for building images for the vanilla plugin.
Yes, I had a quick look and from what I understand we will need to
modify the scripts that build the images. We will make a separate change
request for that.
> [SL] Absolutely, it's a very interesting tool for data processing. IMO
> the best way is to create a change request to savanna for code review
> and discussion in gerrit, it'll be really the most effective way to
> collaborate. As for the best way of integration with Savanna - we're
> expecting to see it in the openstack/savanna repo like vanilla, HDP and
> IDH (which will land soon) plugins.
Nice! I will contact you when I am ready to create the github repo, so
that I do it right for the review process.