[openstack-dev] [sahara] About Sahara Oozie plan and Spark CDH Issues

Trevor McKay tmckay at redhat.com
Wed Jan 28 15:43:30 UTC 2015

Intel folks,

Belated welcome to Sahara!  Thank you for your recent commits.

Moving this thread to openstack-dev so others may contribute, cc'ing
Daniele and Pietro who pioneered the Spark plugin.

I'll respond with another email about the Oozie work, but first I want
to address the Spark/Swift issue in CDH, since I have been working on
it. One task still needs to be done: upgrade the CDH version in the
Spark image and see whether the situation improves (see below).

Relevant reviews are here:


In the first review, you can see that we set an extra driver
classpath to pull in '/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar'.

This is because the spark-assembly JAR in CDH4 contains classes from
jackson-mapper-asl-1.8.8 and jackson-core-asl-1.9.x. When the
hadoop-swift.jar dereferences a Swift path, it calls into code
from jackson-mapper-asl-1.8.8 which uses JsonClass.  But JsonClass
was removed in jackson-core-asl-1.9.x, so there is an exception.
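
For anyone who wants to check what a given assembly actually bundles,
here is a rough sketch that lists matching class entries inside a jar.
The helper name find_classes and the jar path in the usage note are
hypothetical; the only assumption about Jackson is that the 1.x classes
live under the org.codehaus.jackson package.

```python
import zipfile


def find_classes(jar_path, simple_name, package_prefix="org/codehaus/jackson/"):
    """Return jar entries under package_prefix named <simple_name>.class.

    A jar is a zip archive, so the standard zipfile module can list it.
    An empty result means no class with that simple name is bundled
    under the given package prefix.
    """
    suffix = "/" + simple_name + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name
            for name in jar.namelist()
            if name.startswith(package_prefix) and name.endswith(suffix)
        )
```

Usage would look like find_classes("path/to/spark-assembly.jar", "JsonClass"),
where the path is a placeholder for wherever the assembly lives on the
image. This only shows whether a class is present, not which Jackson
release it came from.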

Therefore, we need to use the classpath to either upgrade the version of
jackson-mapper-asl to 1.9.x or downgrade the version of jackson-core-asl
to 1.8.8 (both work in my testing).  However, the first of these options
requires us to bundle an extra jar.  Since /usr/lib/hadoop already
contains jackson-core-asl-1.8.8, it is easier to just add that to the
classpath and downgrade the jackson version.
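
To illustrate the workaround, here is a minimal sketch of building a
spark-submit command line with the extra driver classpath. It assumes
the job is launched through spark-submit and its --driver-class-path
option; the function name build_spark_submit and its parameters are
made up for this example and are not the code in the review.

```python
def build_spark_submit(app_jar, main_class, app_args=(), extra_driver_classpath=None):
    """Build a spark-submit argument list (illustrative helper, not Sahara code).

    extra_driver_classpath is a list of jar paths; they are joined with
    ':' and passed via --driver-class-path.
    """
    cmd = ["spark-submit", "--class", main_class]
    if extra_driver_classpath:
        cmd += ["--driver-class-path", ":".join(extra_driver_classpath)]
    # spark-submit options must come before the application jar;
    # everything after the jar is passed to the application itself.
    cmd.append(app_jar)
    cmd.extend(app_args)
    return cmd


# Example with the jar named above; the app jar, class, and Swift path
# are placeholders.
cmd = build_spark_submit(
    "app.jar",
    "org.example.Main",
    app_args=["swift://container.sahara/input"],
    extra_driver_classpath=["/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar"],
)
```

The resulting list can be handed to subprocess or joined into a shell
command, depending on how the job is launched.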

Note that there are some references to this problem on the Spark mailing
list; we are not the only ones to encounter it.

However, I am not completely comfortable with mixing versions and
patching the classpath this way.  It looks to me like the Spark assembly
used in CDH5 has consistent versions, and I would like to try updating
the CDH version in sahara-image-elements to CDH5 for Spark. If this fixes
the problem and removes the need for the extra classpath, that would be

Would someone like to take on this change? (modifying sahara-image-elements
to use CDH5 for Spark images) I can make a blueprint for

More to come about Oozie topics.

Best regards,


On Thu, 2015-01-15 at 15:34 +0000, Chen, Weiting wrote:
> Hi McKay,
> We are a team at Intel contributing to the OpenStack Sahara project.
> We are new to Sahara and would like to contribute more to the
> project.
> So far, we have been focusing on the Sahara CDH plugin,
> so if there are any issues related to it, please feel free to discuss
> them with us.
> During the IRC meeting you mentioned two issues that we would
> like to discuss with you.
> 1.      Oozie Workflow Support:
> Do you have a plan or ideas you could share with us?
> In our case, we are testing a Java action job with
> HBase library support and are also facing some problems with Oozie
> support,
> so it would be good to share experience with each other.
> 2.      Spark CDH Issues:
> Could you provide more information about this issue? In the CDH plugin, we
> have used CDH 5 to complete the Swift test, so it should be fine to upgrade
> from CDH 4 to 5.
