[Openstack] Hadoop/Swift/Savanna native filesystem integration
Andrew Lazarev
alazarev at mirantis.com
Wed Feb 26 01:25:14 UTC 2014
Hi Ross,
I tried your scenario and was able to reproduce the issue, so I filed
https://bugs.launchpad.net/savanna/+bug/1284906 to track it.
The root cause is that an auth check is missing in the getData method of
the Swift driver for Hadoop, so a read fails whenever it is the first
operation requested. (When the destination is in Swift, the driver first
checks whether the destination directory exists, which is why
swift->swift copies work fine.)
I've created a patch for the hadoop-swift driver:
https://review.openstack.org/#/c/76405/. The jar and images will be updated
once it has been reviewed by the community and merged.
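To make the failure mode concrete, here is a minimal, self-contained model of the bug. All names below are hypothetical stand-ins for the real classes in org.apache.hadoop.fs.swift.http; this is a sketch of the pattern, not the actual driver code:

```java
// Hypothetical model of the bug: until authenticate() runs, the client
// has no endpoint, so any method that builds a URI from it throws the
// "Null Endpoint" error seen in the log below.
class SwiftClientModel {
    private String endpoint;       // null until authenticated
    private boolean authenticated;

    // Stand-in for the Keystone token exchange.
    void authenticate() {
        this.endpoint = "http://storage.example/v1/AUTH_test";
        this.authenticated = true;
    }

    private static <T> T checkNotNull(T value, String message) {
        if (value == null) throw new IllegalStateException(message);
        return value;
    }

    // Buggy variant: assumes some earlier operation already authenticated.
    // Fails if a read is the first operation requested.
    String getDataBuggy(String path) {
        String base = checkNotNull(endpoint,
                "Null Endpoint - client is not authenticated");
        return base + path;
    }

    // Fixed variant: authenticate lazily before touching the endpoint,
    // which is the kind of guard the patch adds to getData.
    String getDataFixed(String path) {
        if (!authenticated) authenticate();
        String base = checkNotNull(endpoint,
                "Null Endpoint - client is not authenticated");
        return base + path;
    }
}
```

This also explains why swift->swift works: the destination-directory existence check runs first and authenticates the client as a side effect, so the subsequent read finds a valid endpoint.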
I see several workarounds for you:
1. The bug is in recent code that adds data locality support. If you don't
plan to use data locality in your initial testing, you can use an older
version of the hadoop-swift driver (
http://savanna-files.mirantis.com/hadoop-swift/hadoop-swift-2013-07-08.jar).
Place this file on every node as
/usr/share/hadoop/lib/hadoop-swift.jar.
2. The issue happens only when data locality is disabled, so you can enable
data locality in Savanna. Refer to the Savanna docs for how to do that.
3. I can send you a jar with the issue fixed (or you can build it yourself
from source with https://review.openstack.org/#/c/76405/ applied).
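As a side note, unrelated to the bug: instead of passing credentials with -D on every distcp invocation, you can set them once in core-site.xml on the cluster nodes. A sketch, assuming the fs.swift.service.os.* keys from your command; the exact auth.url key name is from memory, so double-check it against the driver documentation:

```xml
<!-- Hypothetical core-site.xml fragment for the "os" Swift service.
     Property names follow the fs.swift.service.<name>.* pattern used
     by the -D flags in your distcp command. -->
<property>
  <name>fs.swift.service.os.auth.url</name>
  <value>http://KEYSTONE:5000/v2.0/tokens</value>
</property>
<property>
  <name>fs.swift.service.os.tenant</name>
  <value>test</value>
</property>
<property>
  <name>fs.swift.service.os.username</name>
  <value>tester</value>
</property>
<property>
  <name>fs.swift.service.os.password</name>
  <value>supersecret</value>
</property>
<property>
  <name>fs.swift.service.os.public</name>
  <value>true</value>
</property>
```

With these in place, `swift://data.os/...` paths resolve without any -D options.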
Thanks,
Andrew.
From: Lillie Ross-CDSR11 <Ross.Lillie at motorolasolutions.com>
> Date: Wed, Feb 26, 2014 at 2:21 AM
> Subject: [Openstack] Hadoop/Swift/Savanna native filesystem integration
> To: "openstack at lists.openstack.org" <openstack at lists.openstack.org>
>
>
> All,
>
> I'm trying to integrate and test access to/from our Swift storage ring
> via our Hadoop cluster.
>
> After setting the HDFS permissions, I'm able to perform the following
> tests using the Swift/Hadoop file system:
>
> distcp swift://{container}.{provider}/object swift://{container}.{provider}/new-object
> distcp hdfs://{node}:{port}/object swift://{container}.{provider}/new-object
>
> However, I'm not able to use distcp to move objects from Swift to HDFS
> storage. The log file from the final attempt to create the associated map
> task is as follows:
>
> 2014-02-25 22:10:38,570 DEBUG org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem: Initializing SwiftNativeFileSystem against URI swift://data.os/blob1 and working dir swift://data.os/user/ubuntu
> 2014-02-25 22:10:38,593 DEBUG org.apache.hadoop.fs.swift.http.RestClientBindings: Filesystem swift://data.os/blob1 is using configuration keys fs.swift.service.os
> 2014-02-25 22:10:38,594 DEBUG org.apache.hadoop.fs.swift.http.SwiftRestClient: Service={os} container={data} uri={http://KEYSTONE:5000/v2.0/tokens} tenant={test} user={tester} region={(none)} publicURL={true} location aware={false} partition size={4718592 KB}, buffer size={64 KB} block size={32768 KB} connect timeout={15000}, retry count={3} socket timeout={60000} throttle delay={0}
> 2014-02-25 22:10:38,594 DEBUG org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem: SwiftFileSystem initialized
> 2014-02-25 22:10:38,599 DEBUG org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore: Reading swift://data.os/blob1 from proxy node
> 2014-02-25 22:10:38,607 INFO org.apache.hadoop.tools.DistCp: FAIL blob1 : org.apache.hadoop.fs.swift.exceptions.SwiftInternalStateException: Null Endpoint -client is not authenticated
> at org.apache.hadoop.fs.swift.http.SwiftRestClient.checkNotNull(SwiftRestClient.java:1800)
> at org.apache.hadoop.fs.swift.http.SwiftRestClient.pathToURI(SwiftRestClient.java:1629)
> at org.apache.hadoop.fs.swift.http.SwiftRestClient.pathToURI(SwiftRestClient.java:1669)
> at org.apache.hadoop.fs.swift.http.SwiftRestClient.getData(SwiftRestClient.java:711)
> at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.getObject(SwiftNativeFileSystemStore.java:294)
> at org.apache.hadoop.fs.swift.snative.SwiftNativeInputStream.<init>(SwiftNativeInputStream.java:103)
> at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.open(SwiftNativeFileSystem.java:555)
> at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.open(SwiftNativeFileSystem.java:536)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
> at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:419)
> at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:547)
> at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:314)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> 2014-02-25 22:10:47,700 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2014-02-25 22:10:47,723 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:tester cause:java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
> 2014-02-25 22:10:47,725 WARN org.apache.hadoop.mapred.Child: Error running child
> java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
> at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 2014-02-25 22:10:47,739 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
> ubuntu at slave1:/var/log/hadoop/ubuntu/userlogs/job_201402252129_0006/attempt_201402252129_0006_m_000000_3$
>
> The hadoop distcp command entered was:
>
> master$ hadoop distcp -D fs.swift.service.os.public=true -D fs.swift.service.os.tenant=test -D fs.swift.service.os.username=tester -D fs.swift.service.os.password=supersecret swift://data.os/blob1 hdfs://<hdfs ip>:9000/data/blob3
>
> Am I missing anything obvious?
>
> Thanks and regards,
> Ross
>
>
>
> --
> Ross Lillie
> Distinguished Member of Technical Staff
> Motorola Solutions, Inc.
>
> motorolasolutions.com
> O: +1.847.576.0012
> M: +1.847.980.2241
> E: ross.lillie at motorolasolutions.com
>
>
>
>
> _______________________________________________
> Mailing list:
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to : openstack at lists.openstack.org
> Unsubscribe :
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>
>
>