[Openstack] Ceph vs swift

Vincenzo Pii piiv at zhaw.ch
Mon Jun 16 09:03:39 UTC 2014


Hi Chuck,

Many thanks for your comments!
I have replied on the blog.

Best regards,
Vincenzo.


2014-06-12 21:10 GMT+02:00 Chuck Thier <cthier at gmail.com>:

>  Hi Vincenzo,
>
>  First, thank you for this work.  It is always interesting to see
> different data points from different use cases.
>
>  I noticed a couple of things and would like to ask a couple of questions
> and make some observations.
>
>  Comparing a high level HTTP/REST based api (swift) to a low-level C
> based api (librados) is not quite an apples to apples comparison.  A more
> interesting comparison, or at least another data point would be to run the
> same tests with the rados gateway which provides a REST based interface.  I
> believe that what you attribute to differences between the performance of
> the CRUSH algorithm and swift's ring is more likely attributable to the
> extra overhead of a high level interface.
>
>  An optimization is made for ceph by adding an extra disk to store the
> journal, which specifically enhances the small object performance, for
> a total of 3 spindles per node.  Yet, swift was only given 2 spindles per
> node, thus giving ceph substantially more overall IO to work with.
>
>  In several of the results, the graphs show a significant reduction of
> throughput with swift going from 1 container to 20 containers with the same
> sized object.  In a correctly configured swift cluster, performance for
> PUTs at high concurrency will always be faster with more containers.
>  Container operations are not part of the GET path, so the number of
> containers will not affect GET performance.  This leads me to believe that
> either swift isn't properly configured, or the client is doing something
> non-optimal for those cases.
>
>  The eventual consistency semantics of swift are reported incorrectly.
>  For 3 replicas, swift will stream out all 3 copies of the object to their
> locations at the same time, and only return success if at least 2 of those
> are successful.  This is somewhat similar to the behavior of ceph.
>  Replication in swift is only used when there are failures.
>
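
The write path described above (all 3 copies streamed out at the same time, success only if at least 2 of them land) can be sketched roughly as below. This is only an illustration and not Swift's actual proxy code: `put_object`, `quorum_size`, `ok_store` and `failing_store` are hypothetical names, the backends are plain Python callables standing in for storage nodes, and the majority rule is an assumption that generalizes the "2 of 3" case.

```python
from concurrent.futures import ThreadPoolExecutor


def quorum_size(replicas):
    # Assumed majority rule: 2 of 3, 3 of 5, and so on.
    return replicas // 2 + 1


def put_object(data, backends):
    """Send data to every backend concurrently; succeed on a majority."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        # Start all writes at once instead of one after another.
        futures = [pool.submit(write, data) for write in backends]
        successes = 0
        for future in futures:
            try:
                future.result()
                successes += 1
            except Exception:
                # A failed copy is tolerated as long as quorum is reached.
                pass
    return successes >= quorum_size(len(backends))


def ok_store(data):
    # Hypothetical backend that always accepts the write.
    return len(data)


def failing_store(data):
    # Hypothetical backend that simulates a dead disk or node.
    raise IOError("backend unavailable")


if __name__ == "__main__":
    print(put_object(b"payload", [ok_store, ok_store, failing_store]))
    print(put_object(b"payload", [ok_store, failing_store, failing_store]))
```

With one failing backend out of three the PUT is still reported as successful; with two failing it is not. In the first case the missing copy would be restored later by replication, which matches the point that replication is only needed when there are failures.
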
>  I would also suggest expanding the data set a bit.  For example, test
> the performance after the system has been filled more than 50%.  I would
> also highly recommend testing performance when there are failures, such as
> a dead disk, or one of the nodes going away.
>
>  Thanks,
>
>  --
> Chuck
>
>
> On Thu, Jun 12, 2014 at 3:46 AM, Vincenzo Pii <piiv at zhaw.ch> wrote:
>
>> As promised, the results for our study on Ceph vs Swift for object
>> storage:
>> http://blog.zhaw.ch/icclab/evaluating-the-performance-of-ceph-and-swift-for-object-storage-on-small-clusters/
>>
>>
>> 2014-06-06 20:19 GMT+02:00 Matthew Farrellee <matt at redhat.com>:
>>
>>> On 06/02/2014 02:52 PM, Chuck Thier wrote:
>>>
>>>>  I have heard that there has been some work to integrate Hadoop with
>>>> Swift, but know very little about it.  Integration with MS exchange, but
>>>> could be an interesting use case.
>>>>
>>>
>>>  nutshell: the hadoop ecosystem tends to integrate with mapreduce or
>>> hdfs. hdfs is the hadoop implementation of a distributed file system
>>> interface. there are a handful of others -
>>> http://wiki.apache.org/hadoop/HCFS - including one for swift. so where
>>> you might access a file via hdfs:///... you can also use swift:///... for
>>> processing by mapreduce or other frameworks in hadoop.
>>>
>>> best,
>>>
>>>
>>> matt
>>>
>>>
>>>
>>> _______________________________________________
>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>> Post to     : openstack at lists.openstack.org
>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>
>>
>>
>>
>>  --
>>  Vincenzo Pii
>>  Researcher, InIT Cloud Computing Lab
>> Zurich University of Applied Sciences (ZHAW)
>> http://www.cloudcomp.ch/
>>
>>
>>
>


-- 
Vincenzo Pii
Researcher, InIT Cloud Computing Lab
Zurich University of Applied Sciences (ZHAW)
http://www.cloudcomp.ch/