[Openstack] Ceph vs swift

Chuck Thier cthier at gmail.com
Thu Jun 12 19:10:38 UTC 2014


Hi Vincenzo,

First, thank you for this work.  It is always interesting to see
different data points from different use cases.

I noticed a couple of things and would like to ask some questions and
make a few observations.

Comparing a high-level HTTP/REST-based API (Swift) to a low-level
C-based API (librados) is not quite an apples-to-apples comparison.  A
more interesting comparison, or at least another data point, would be to
run the same tests against the RADOS gateway, which provides a REST-based
interface.  I believe that what you attribute to performance differences
between the CRUSH algorithm and Swift's ring is more likely explained by
the extra overhead of a high-level interface.
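For instance, since the RADOS gateway exposes a Swift-compatible API,
both systems can be driven by the exact same client code.  A rough
sketch of what I mean (the endpoints, credentials, object counts and
sizes below are placeholders, not values from your setup):

    import time
    from swiftclient.client import Connection

    # Hypothetical endpoints and credentials; replace with real values.
    ENDPOINTS = {
        "swift":   dict(authurl="http://swift-proxy:8080/auth/v1.0",
                        user="test:tester", key="testing"),
        "radosgw": dict(authurl="http://radosgw-host/auth/v1.0",
                        user="bench:swift", key="secret"),
    }

    N_OBJECTS = 100
    PAYLOAD = b"x" * 4096   # 4 KB objects; vary per test case

    def time_put_get(conn):
        # Upload N_OBJECTS objects, then read them back, timing each phase.
        conn.put_container("bench")
        start = time.time()
        for i in range(N_OBJECTS):
            conn.put_object("bench", "obj-%d" % i, contents=PAYLOAD)
        put_secs = time.time() - start
        start = time.time()
        for i in range(N_OBJECTS):
            conn.get_object("bench", "obj-%d" % i)
        get_secs = time.time() - start
        return put_secs, get_secs

    for name, creds in ENDPOINTS.items():
        puts, gets = time_put_get(Connection(**creds))
        print("%s: %d PUTs in %.2fs, %d GETs in %.2fs"
              % (name, N_OBJECTS, puts, N_OBJECTS, gets))

With the API layer held constant like this, any remaining gap is much
more plausibly down to data placement and the storage backends.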

Ceph is given an optimization in the form of an extra disk to store the
journal, which specifically improves small-object performance, for a
total of 3 spindles per node.  Yet Swift was only given 2 spindles per
node, giving Ceph substantially more overall IO to work with.

In several of the results, the graphs show a significant reduction in
Swift's throughput when going from 1 container to 20 containers with the
same object size.  In a correctly configured Swift cluster, PUT
performance at high concurrency will always be better with more
containers.  Container operations are not part of the GET path, so the
number of containers will not affect GET performance.  This leads me to
believe that either Swift isn't properly configured, or the client is
doing something non-optimal in those cases.
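To make the container point concrete, here is a rough sketch of how a
client can spread PUTs over several containers so that concurrent
uploads do not all contend on a single container database (the container
count and naming are arbitrary choices of mine, not anything your
benchmark uses):

    from hashlib import md5
    from swiftclient.client import Connection

    NUM_CONTAINERS = 20   # arbitrary; tune to the workload

    def container_for(obj_name):
        # Deterministically spread objects over NUM_CONTAINERS containers.
        digest = md5(obj_name.encode("utf-8")).hexdigest()
        return "bench_%02d" % (int(digest, 16) % NUM_CONTAINERS)

    def upload(conn, obj_name, data):
        container = container_for(obj_name)
        conn.put_container(container)   # no-op if it already exists
        conn.put_object(container, obj_name, contents=data)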

The eventual consistency semantics of Swift are reported incorrectly.
For 3 replicas, Swift streams all 3 copies of the object to their
locations at the same time, and only returns success if at least 2 of
those writes succeed.  This is somewhat similar to the behavior of Ceph.
Replication in Swift is only used when there are failures.
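The write path behaves roughly like this toy sketch (an illustration of
the quorum rule only, not Swift's actual proxy code):

    from concurrent.futures import ThreadPoolExecutor

    REPLICAS = 3
    QUORUM = REPLICAS // 2 + 1   # 2 of 3 for triple replication

    def write_with_quorum(write_to_replica, data):
        # write_to_replica(replica_index, data) -> bool is a stand-in for
        # streaming one copy of the object to its storage node.
        with ThreadPoolExecutor(max_workers=REPLICAS) as pool:
            ok = list(pool.map(lambda i: write_to_replica(i, data),
                               range(REPLICAS)))
        return sum(ok) >= QUORUM   # client sees success only on quorum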

I would also suggest expanding the data set a bit.  For example, test
performance after the system has been filled to more than 50% of its
capacity.  I would also highly recommend testing performance during
failures, such as a dead disk or a node going away.
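For the fill test, even something as crude as the following would get
the cluster past the 50% mark before re-running the benchmarks (the
sizes here are placeholders):

    import os
    from swiftclient.client import Connection

    def fill_cluster(conn, target_bytes, object_size=64 * 1024 * 1024):
        # Upload random data until roughly target_bytes have been written.
        conn.put_container("fill")
        written, i = 0, 0
        while written < target_bytes:
            conn.put_object("fill", "fill-%08d" % i,
                            contents=os.urandom(object_size))
            written += object_size
            i += 1
        return written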

Thanks,

--
Chuck


On Thu, Jun 12, 2014 at 3:46 AM, Vincenzo Pii <piiv at zhaw.ch> wrote:

> As promised, the results for our study on Ceph vs Swift for object
> storage:
> http://blog.zhaw.ch/icclab/evaluating-the-performance-of-ceph-and-swift-for-object-storage-on-small-clusters/
>
>
> 2014-06-06 20:19 GMT+02:00 Matthew Farrellee <matt at redhat.com>:
>
>> On 06/02/2014 02:52 PM, Chuck Thier wrote:
>>
>>> I have heard that there has been some work to integrate Hadoop with
>>> Swift, but know very little about it.  Integration with MS Exchange
>>> could also be an interesting use case.
>>>
>>
>> nutshell: the hadoop ecosystem tends to integrate with mapreduce or hdfs.
>> hdfs is the hadoop implementation of a distributed file system interface.
>> there are a handful of others - http://wiki.apache.org/hadoop/HCFS -
>> including one for swift. so where you might access a file via
>> hdfs:///... you can also use swift:///... for processing by mapreduce or other
>> frameworks in hadoop.
>>
>> best,
>>
>>
>> matt
>>
>>
>>
>
>
>
> --
> Vincenzo Pii
> Researcher, InIT Cloud Computing Lab
> Zurich University of Applied Sciences (ZHAW)
> http://www.cloudcomp.ch/
>