[Openstack] Hadoop/Swift/Savanna native filesystem integration

Lillie Ross-CDSR11 Ross.Lillie at motorolasolutions.com
Wed Feb 26 03:12:12 UTC 2014


Andrew,

Thanks loads for verifying this and giving me pointers to workarounds. I may contact you if I have any other issues, but the 'fixes' you gave me should get us up and running!

Between upgrading from Essex to Havana, integrating Neutron, and dealing with Hadoop, I'm never sure whether a problem is a mistake on my part or a real issue. Thanks loads.

Thanks again and regards,
Ross

On Feb 25, 2014, at 7:41 PM, "Andrew Lazarev" <alazarev at mirantis.com> wrote:

Hi Ross,

I tried your scenario and was able to reproduce the issue, so I filed bug https://bugs.launchpad.net/savanna/+bug/1284906 to track it.

The root cause is that an authentication check is missing in the getData method of the Swift driver for Hadoop, so the driver fails to read data if the read is the first operation requested. When the destination is in Swift, the driver first checks whether the destination directory exists, and that check authenticates the client before any read; that's why swift->swift works fine.
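
As a rough illustration of the failure mode described above, here is a toy Python model of a client that authenticates lazily. It is only a sketch: the class, method names, and endpoint URL are hypothetical, and it does not reproduce the actual hadoop-swift driver code or the patch under review.

```python
class NotAuthenticatedError(Exception):
    """Raised when a request is attempted before an endpoint is known."""


class SketchSwiftClient:
    """Toy stand-in for a REST client that authenticates lazily."""

    def __init__(self, auth_on_read):
        self.endpoint = None  # unknown until authentication succeeds
        self.auth_on_read = auth_on_read

    def _authenticate_if_needed(self):
        # Hypothetical helper: a real client would contact Keystone here.
        if self.endpoint is None:
            self.endpoint = "http://swift.example/v1/AUTH_test"

    def _path_to_uri(self, path):
        if self.endpoint is None:
            raise NotAuthenticatedError("Null Endpoint -client is not authenticated")
        return self.endpoint + path

    def head_object(self, path):
        # Metadata calls, such as an existence check, authenticate first.
        self._authenticate_if_needed()
        return self._path_to_uri(path)

    def get_data(self, path):
        # The buggy variant skips the check; the fixed variant performs it.
        if self.auth_on_read:
            self._authenticate_if_needed()
        return self._path_to_uri(path)


buggy = SketchSwiftClient(auth_on_read=False)
try:
    buggy.get_data("/data/blob1")  # a read as the very first operation
except NotAuthenticatedError as exc:
    print("read failed:", exc)

buggy.head_object("/data")  # an existence check authenticates the client
print(buggy.get_data("/data/blob1"))  # the same read now succeeds
```

In this model, any earlier call that authenticates hides the missing check, which matches the observation that copies with a Swift destination succeed while swift->hdfs fails.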

I've created a patch for the hadoop-swift driver: https://review.openstack.org/#/c/76405/. The jar and images will be updated once the patch has been reviewed by the community and merged.

I see several workarounds for you:
1. The bug is in recent code that adds support for the data locality feature. If you don't plan to use data locality in initial testing, you can use an older version of the hadoop-swift driver (http://savanna-files.mirantis.com/hadoop-swift/hadoop-swift-2013-07-08.jar). You should place this file on all nodes as /usr/share/hadoop/lib/hadoop-swift.jar.
2. The issue happens only if data locality is disabled. You can enable data locality in Savanna; refer to the Savanna docs for how to do it.
3. I can send you a jar with the issue fixed (or you can build it yourself from the code with https://review.openstack.org/#/c/76405/ applied).

Thanks,
Andrew.


From: Lillie Ross-CDSR11 <Ross.Lillie at motorolasolutions.com>
Date: Wed, Feb 26, 2014 at 2:21 AM
Subject: [Openstack] Hadoop/Swift/Savanna native filesystem integration
To: "openstack at lists.openstack.org" <openstack at lists.openstack.org>


All,

I’m trying to integrate and test access to/from our Swift storage ring via our Hadoop cluster.

After setting the HDFS permissions, I’m able to perform the following tests using the Swift/Hadoop file system:

distcp swift://{container}.{provider}/object swift://{container}.{provider}/new-object
distcp hdfs://{node}:{port}/object swift://{container}.{provider}/new-object
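
For readers unfamiliar with the swift://{container}.{provider}/object form used above, the following Python sketch shows how such a URI might be broken into its parts. It assumes the first dot in the host separates the container from the provider (service) name and that the matching configuration keys start with fs.swift.service.{provider}, as the driver log later in this message suggests; parse_swift_uri is a hypothetical helper, not part of Hadoop.

```python
from urllib.parse import urlparse


def parse_swift_uri(uri):
    """Split swift://{container}.{provider}/object into its components."""
    parsed = urlparse(uri)
    if parsed.scheme != "swift":
        raise ValueError("not a swift:// URI: " + uri)
    # Assumption: the first dot separates the container from the service name.
    container, dot, service = parsed.netloc.partition(".")
    if not dot or not container or not service:
        raise ValueError("expected host of the form container.provider: " + uri)
    return {
        "container": container,
        "service": service,
        "object_path": parsed.path,
        "config_prefix": "fs.swift.service." + service,
    }


print(parse_swift_uri("swift://data.os/blob1"))
```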

However, I’m not able to use distcp to move objects from Swift to HDFS storage. The log file from the final attempt to create the associated map task is as follows:

2014-02-25 22:10:38,570 DEBUG org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem: Initializing SwiftNativeFileSystem against URI swift://data.os/blob1 and working dir swift://data.os/user/ubuntu
2014-02-25 22:10:38,593 DEBUG org.apache.hadoop.fs.swift.http.RestClientBindings: Filesystem swift://data.os/blob1 is using configuration keys fs.swift.service.os
2014-02-25 22:10:38,594 DEBUG org.apache.hadoop.fs.swift.http.SwiftRestClient: Service={os} container={data} uri={http://KEYSTONE:5000/v2.0/tokens} tenant={test} user={tester} region={(none)} publicURL={true} location aware={false} partition size={4718592 KB}, buffer size={64 KB} block size={32768 KB} connect timeout={15000}, retry count={3} socket timeout={60000} throttle delay={0}
2014-02-25 22:10:38,594 DEBUG org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem: SwiftFileSystem initialized
2014-02-25 22:10:38,599 DEBUG org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore: Reading swift://data.os/blob1 from proxy node
2014-02-25 22:10:38,607 INFO org.apache.hadoop.tools.DistCp: FAIL blob1 : org.apache.hadoop.fs.swift.exceptions.SwiftInternalStateException: Null Endpoint -client is not authenticated
at org.apache.hadoop.fs.swift.http.SwiftRestClient.checkNotNull(SwiftRestClient.java:1800)
at org.apache.hadoop.fs.swift.http.SwiftRestClient.pathToURI(SwiftRestClient.java:1629)
at org.apache.hadoop.fs.swift.http.SwiftRestClient.pathToURI(SwiftRestClient.java:1669)
at org.apache.hadoop.fs.swift.http.SwiftRestClient.getData(SwiftRestClient.java:711)
at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.getObject(SwiftNativeFileSystemStore.java:294)
at org.apache.hadoop.fs.swift.snative.SwiftNativeInputStream.<init>(SwiftNativeInputStream.java:103)
at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.open(SwiftNativeFileSystem.java:555)
at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.open(SwiftNativeFileSystem.java:536)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:419)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:547)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:314)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

2014-02-25 22:10:47,700 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2014-02-25 22:10:47,723 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:tester cause:java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
2014-02-25 22:10:47,725 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
2014-02-25 22:10:47,739 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
ubuntu@slave1:/var/log/hadoop/ubuntu/userlogs/job_201402252129_0006/attempt_201402252129_0006_m_000000_3$

The hadoop distcp command entered was:

master$ hadoop distcp -D fs.swift.service.os.public=true -D fs.swift.service.os.tenant=test -D fs.swift.service.os.username=tester -D fs.swift.service.os.password=supersecret swift://data.os/blob1 hdfs://<hdfs ip>:9000/data/blob3
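
To make the structure of that command easier to see, here is a small Python sketch that assembles the same kind of argument list from a dictionary of per-service settings. The helper build_distcp_command and the host name namenode.example are hypothetical and used only for illustration; the -D fs.swift.service.os.* properties are the ones from the command above.

```python
import shlex


def build_distcp_command(service, settings, source, destination):
    """Return the argv list for a distcp run with per-service Swift settings."""
    args = ["hadoop", "distcp"]
    for key, value in settings.items():
        # Each setting becomes a -D fs.swift.service.<service>.<key>=<value> pair.
        args += ["-D", "fs.swift.service.{}.{}={}".format(service, key, value)]
    args += [source, destination]
    return args


command = build_distcp_command(
    "os",
    {"public": "true", "tenant": "test", "username": "tester", "password": "supersecret"},
    "swift://data.os/blob1",
    "hdfs://namenode.example:9000/data/blob3",  # hypothetical HDFS address
)
print(" ".join(shlex.quote(arg) for arg in command))
```

Note that passing the password with -D exposes it in the process list and shell history, so this form is probably best kept to test setups like this one.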

Am I missing anything obvious?

Thanks and regards,
Ross



--
Ross Lillie
Distinguished Member of Technical Staff
Motorola Solutions, Inc.

motorolasolutions.com
O: +1.847.576.0012
M: +1.847.980.2241
E: ross.lillie at motorolasolutions.com


_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack at lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack