[Openstack] Savanna/Swift large object copy error

Ross Lillie ross.lillie at motorolasolutions.com
Tue May 5 20:19:40 UTC 2015


As a follow-up: when performing a distcp from HDFS to Swift, segments ARE
being created in the Swift container under a .distcp- prefix. Each temporary
object appears to correspond to one attempt of the map/reduce job.

Just as the last temporary segment appears in the remote container, the job
aborts, all of the .distcp- temporary objects are deleted, and Hadoop
moves on to the next "attempt".

For example, for the currently running test case, the swift container
listing shows the following:

zantac:~ lillie$ swift list --lh backups

2.4G 2015-05-05 20:13:16
.distcp.tmp.attempt_1430771817173_0010_m_000000_0/000001

2.4G 2015-05-05 20:14:45
.distcp.tmp.attempt_1430771817173_0010_m_000000_0/000002

2.4G 2015-05-05 20:16:15
.distcp.tmp.attempt_1430771817173_0010_m_000000_0/000003

7.2G

Once the entire file is "copied", the operation reports Error 413 (Request
Entity Too Large), and all of the above objects are deleted. It's as though
the Swift file system isn't able to close the file.
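For context, Swift rejects any single PUT larger than its 5 GB object limit
with a 413; clients get around this by uploading numbered segments and then a
manifest object that ties them together, which is what the
.distcp.tmp.../00000N objects above look like. A minimal sketch of that
segmentation arithmetic (the helper name and prefix are hypothetical, not
Hadoop's actual code):

```python
import math

SWIFT_MAX_OBJECT = 5 * 1024**3  # Swift's default single-object limit (5 GiB)

def segment_plan(total_bytes, segment_bytes, prefix):
    """Split an upload into numbered segment objects, mimicking the
    <prefix>/000001, <prefix>/000002, ... names seen in the listing."""
    if segment_bytes > SWIFT_MAX_OBJECT:
        raise ValueError("each segment must stay under the 5 GiB limit")
    count = math.ceil(total_bytes / segment_bytes)
    plan = []
    for i in range(count):
        # Every segment is full-size except (possibly) the last one.
        size = min(segment_bytes, total_bytes - i * segment_bytes)
        plan.append(("%s/%06d" % (prefix, i + 1), size))
    return plan

# The 5.5 GB test case below, segmented at 1 GiB, splits into six objects:
# five full segments plus one 512 MiB remainder.
plan = segment_plan(5905580032, 1073741824, ".distcp.tmp.attempt_x")
```

If it is the final manifest PUT that triggers the 413, that would be
consistent with the segments appearing successfully and then being rolled
back.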


/ross

On Tue, May 5, 2015 at 2:54 PM, Ross Lillie <
ross.lillie at motorolasolutions.com> wrote:

> We're currently running OpenStack Juno and are experiencing errors when
> performing large object copies between Hadoop HDFS and our Swift object
> store. While not using the Savanna service directly, we are relying upon the
> Swift file system extension for Hadoop created as part of the Savanna
> project.
>
> In each case, the large object copy (using Hadoop's distcp) results in
> Swift reporting an Error 413 - Request entity too large.
>
> As a test case, I created a 5.5 GB file of random data and tried to upload
> the file to Swift using Swift's CLI command. Once again Swift returned
> Error 413. If, however, I explicitly set a segment size on the Swift
> command line of 1G, then the file uploads correctly.
>
> When using Hadoop's distcp to move data from HDFS to Swift, the job always
> exits with Swift reporting Error 413. Explicitly setting
> fs.swift.service.x.partsize does not appear to make any difference.
>
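> For comparison, the Hadoop-side partition size can also be set in
> core-site.xml. A sketch, assuming the hadoop-openstack property name
> (my understanding is that the global fs.swift.partsize, expressed in KB,
> is what applies, rather than a per-service variant; worth double-checking
> against the hadoop-openstack docs):
>
> ```xml
> <!-- core-site.xml sketch; property name assumed from hadoop-openstack,
>      value believed to be in KB -->
> <property>
>   <name>fs.swift.partsize</name>
>   <value>1048576</value> <!-- 1 GB partitions, well under the 5 GB limit -->
> </property>
> ```
>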
> My understanding is that Swift should automagically split files greater
> than 5 GB into multiple segments grouped under a manifest, but this appears
> not to be working. This was working under the Havana release (Ubuntu) using
> the Swift file system jar downloaded from the Mirantis web site. All
> current testing is based on the Juno release, performing the distcp with
> the hadoop-openstack jar shipped as part of the latest Hadoop distros.
>
> Has anyone else seen this behavior?
>
> Thanks,
> /ross
>
> --
> Ross Lillie
> Application Software & Architecture Group
>
>
>


-- 
Ross Lillie
Application Software & Architecture Group

View my calendar
<https://www.google.com/calendar/embed?src=ross.lillie%40motorolasolutions.com&ctz=America/Chicago>