[openstack-dev] [sahara] About Sahara Oozie plan and Spark CDH Issues

Daniele Venzano venza at brownhat.org
Wed Jan 28 16:40:09 UTC 2015


Hello everyone,

there is already some code in our repository:
https://github.com/bigfootproject/savanna-image-elements

I made the necessary changes to have the Spark element use the cdh5
element. I also updated to Spark 1.2. The old Cloudera HDFS-only
element is still needed for generating cdh4 images (but cdh4
support can probably be dropped).

Unfortunately I do not have the time to do the necessary
testing/validation and submit for review. I also changed the CDH
element so that it can install only HDFS, if so required.
The changes I made are simple and all contained in the last commit on
the master branch of that repo.

The image generated with this code runs in Sahara without any further
changes. Feel free to take the code, clean it up and submit for review.

Dan

On Wed, Jan 28, 2015 at 10:43:30AM -0500, Trevor McKay wrote:
> Intel folks,
> 
> Belated welcome to Sahara!  Thank you for your recent commits.
> 
> Moving this thread to openstack-dev so others may contribute, cc'ing
> Daniele and Pietro who pioneered the Spark plugin.
> 
> I'll respond with another email about Oozie work, but I want to
> address the Spark/Swift issue in CDH since I have been working
> on it and there is a task which still needs to be done -- that
> is to upgrade the CDH version in the spark image and see if
> the situation improves (see below).
> 
> Relevant reviews are here:
> 
> https://review.openstack.org/146659
> https://review.openstack.org/147955
> https://review.openstack.org/147985
> 
> In the first review, you can see that we set an extra driver
> classpath to pull in '/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar'.
> 
> This is because the spark-assembly JAR in CDH4 contains classes from
> jackson-mapper-asl-1.8.8 and jackson-core-asl-1.9.x. When the
> hadoop-swift.jar dereferences a Swift path, it calls into code
> from jackson-mapper-asl-1.8.8 which uses JsonClass.  But JsonClass
> was removed in jackson-core-asl-1.9.x, so there is an exception.
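
To check which copies of a class a given assembly really bundles, something like the following Python sketch could be used. It is only an illustration: the helper name find_class_entries is made up here, and it matches on the simple class name because the full package path is not given in this thread.

```python
import zipfile


def find_class_entries(jar_path, simple_name):
    """Return the sorted jar entries named <simple_name>.class.

    A jar is a zip archive, so the standard zipfile module can list it.
    Matching is on the simple class name only (hypothetical helper; the
    package of the class is deliberately not assumed).
    """
    top_level = simple_name + ".class"
    nested_suffix = "/" + top_level
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name
            for name in jar.namelist()
            if name == top_level or name.endswith(nested_suffix)
        )


# Example call (the jar path is a placeholder, not a path from this thread):
# find_class_entries("spark-assembly.jar", "JsonClass")
```

An empty list for the assembly and a non-empty list for the jar under /usr/lib/hadoop/lib would be consistent with the mismatch described above, although it does not by itself tell you which Jackson version each class came from.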
> 
> Therefore, we need to use the classpath to either upgrade the version of
> jackson-mapper-asl to 1.9.x or downgrade the version of jackson-core-asl
> to 1.8.8 (both work in my testing).  However, the first of these options
> requires us to bundle an extra jar.  Since /usr/lib/hadoop already
> contains jackson-core-asl-1.8.8, it is easier to just add that to the
> classpath and downgrade the jackson version.
> 
> Note that there are some references to this problem on the Spark
> mailing list; we are not the only ones to encounter it.
> 
> However, I am not completely comfortable with mixing versions and
> patching the classpath this way.  It looks to me like the Spark assembly
> used in CDH5 has consistent versions, and I would like to try updating
> the CDH version in sahara-image-elements to CDH5 for Spark. If this fixes
> the problem and removes the need for the extra classpath, that would be
> great.
> 
> Would someone like to take on this change? (modifying sahara-image-elements
> to use CDH5 for Spark images) I can make a blueprint for
> it.
> 
> More to come about Oozie topics.
> 
> Best regards,
> 
> Trevor
> 
> On Thu, 2015-01-15 at 15:34 +0000, Chen, Weiting wrote:
> > Hi McKay,
> > 
> > We are a team at Intel contributing to the OpenStack Sahara project.
> > We are new to Sahara and would like to contribute more to the
> > project. So far, we have been focusing on the Sahara CDH plugin, so
> > if there are any issues related to it, please feel free to discuss
> > them with us.
> > 
> > During the IRC meeting you mentioned two issues that we would like
> > to discuss with you.
> > 
> > 1. Oozie Workflow Support:
> > 
> > Do you have a plan you could share with us? We are testing a Java
> > action job with HBase library support and are also running into
> > some problems with Oozie support, so it would be good to share our
> > experiences with each other.
> > 
> > 2. Spark CDH Issues:
> > 
> > Could you provide more information about this issue? In the CDH
> > plugin, we have used CDH 5 to complete the Swift test, so it should
> > be fine to upgrade from CDH 4 to 5.

-- 
Daniele Venzano
http://www.brownhat.org



