[Openstack] Ceph vs swift
Chuck Thier
cthier at gmail.com
Mon Jun 16 17:47:44 UTC 2014
Hi Vincenzo,
Thanks for taking the time to review my suggestions. I'm a bit concerned,
though, as you haven't addressed one of the biggest issues: your test
results still indicate that the swift cluster wasn't properly configured,
while the ceph cluster was optimized with an extra disk for journals, which
makes the results very misleading.
--
Chuck
On Mon, Jun 16, 2014 at 4:03 AM, Vincenzo Pii <piiv at zhaw.ch> wrote:
> Hi Chuck,
>
> Many thanks for your comments!
> I have replied on the blog.
>
> Best regards,
> Vincenzo.
>
>
> 2014-06-12 21:10 GMT+02:00 Chuck Thier <cthier at gmail.com>:
>
>> Hi Vincenzo,
>>
>> First, thank you for this work. It is always interesting to see
>> different data points from different use cases.
>>
>> I noticed a couple of things and would like to ask some questions and
>> make a few observations.
>>
>> Comparing a high-level HTTP/REST-based API (swift) to a low-level C-based
>> API (librados) is not quite an apples-to-apples comparison. A more
>> interesting comparison, or at least another data point, would be to run the
>> same tests against the rados gateway, which provides a REST-based interface.
>> I believe that what you attribute to differences between the performance of
>> the CRUSH algorithm and swift's ring is more likely attributable to the
>> extra overhead of a high-level interface.
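>>
>> To make that concrete, here is a rough sketch of the kind of REST-level
>> comparison I have in mind, using python-swiftclient for swift and boto
>> against the rados gateway's S3-compatible API. The endpoints, ports,
>> credentials, and names are placeholders for whatever your setup uses:
>>
>>   # Sketch: time one object PUT through swift's REST API and one through
>>   # the rados gateway's S3-compatible REST API, so both systems pay the
>>   # same HTTP overhead. All endpoints and credentials are placeholders.
>>   import time
>>   import boto
>>   import boto.s3.connection
>>   import swiftclient
>>
>>   payload = 'x' * 4096  # small object; match the sizes from your tests
>>
>>   # swift: PUT through the proxy server
>>   swift = swiftclient.client.Connection(
>>       authurl='http://swift-proxy:8080/auth/v1.0',
>>       user='test:tester', key='testing')
>>   swift.put_container('bench')
>>   t0 = time.time()
>>   swift.put_object('bench', 'obj-1', contents=payload)
>>   print('swift PUT: %.4fs' % (time.time() - t0))
>>
>>   # ceph: PUT through the rados gateway (S3 API) instead of librados
>>   s3 = boto.connect_s3(
>>       aws_access_key_id='RGW_ACCESS_KEY',
>>       aws_secret_access_key='RGW_SECRET_KEY',
>>       host='radosgw-host', port=80, is_secure=False,
>>       calling_format=boto.s3.connection.OrdinaryCallingFormat())
>>   bucket = s3.create_bucket('bench')
>>   t0 = time.time()
>>   bucket.new_key('obj-1').set_contents_from_string(payload)
>>   print('radosgw PUT: %.4fs' % (time.time() - t0))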
>>
>> An optimization is made for ceph by adding an extra disk to store the
>> journal, which specifically enhances small-object performance, for a total
>> of 3 spindles per node. Yet swift was only given 2 spindles per node, thus
>> giving ceph substantially more overall IO to work with.
>>
>> In several of the results, the graphs show a significant reduction in
>> throughput for swift when going from 1 container to 20 containers with the
>> same-sized objects. In a correctly configured swift cluster, PUTs at high
>> concurrency will always be faster with more containers. Container
>> operations are not part of the GET path, so the number of containers will
>> not affect GET performance. This leads me to believe that either swift
>> isn't properly configured, or the client is doing something non-optimal in
>> those cases.
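>>
>> To illustrate the container spreading, here is a minimal sketch with
>> python-swiftclient and a thread pool; the endpoint, credentials, and
>> counts are just placeholders:
>>
>>   # Sketch: at high PUT concurrency, hashing objects across many
>>   # containers avoids serializing updates on a single container database.
>>   import swiftclient
>>   from multiprocessing.pool import ThreadPool
>>
>>   AUTH_URL = 'http://swift-proxy:8080/auth/v1.0'  # placeholder
>>   NUM_CONTAINERS = 20
>>   NUM_OBJECTS = 1000
>>   payload = 'x' * 4096
>>
>>   def put_one(i):
>>       # A fresh connection per PUT keeps the sketch simple.
>>       conn = swiftclient.client.Connection(
>>           authurl=AUTH_URL, user='test:tester', key='testing')
>>       conn.put_object('bench_%d' % (i % NUM_CONTAINERS),
>>                       'obj_%d' % i, contents=payload)
>>
>>   setup = swiftclient.client.Connection(
>>       authurl=AUTH_URL, user='test:tester', key='testing')
>>   for c in range(NUM_CONTAINERS):
>>       setup.put_container('bench_%d' % c)
>>
>>   ThreadPool(50).map(put_one, range(NUM_OBJECTS))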
>>
>> The eventual consistency semantics of swift are reported incorrectly.
>> For 3 replicas, swift will stream out all 3 copies of the object to their
>> locations at the same time, and only return success if at least 2 of those
>> are successful. This is somewhat similar to the behavior of ceph.
>> Replication in swift is only used when there are failures.
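>>
>> To be explicit about the rule, here is a toy sketch of that write path,
>> assuming 3 replicas and a simple majority quorum; it mirrors the behavior,
>> not swift's actual proxy-server code, and store_on_node is a stand-in for
>> the real PUT to one object server:
>>
>>   # Toy model: stream the object to all replica locations concurrently
>>   # and report success to the client only if a majority succeed.
>>   from multiprocessing.pool import ThreadPool
>>
>>   REPLICAS = 3
>>   QUORUM = REPLICAS // 2 + 1  # 2 of 3
>>
>>   def store_on_node(node, name, data):
>>       # Placeholder for the PUT to a single object server.
>>       return node.write(name, data)  # True on success, False on failure
>>
>>   def put_object(nodes, name, data):
>>       results = ThreadPool(len(nodes)).map(
>>           lambda node: store_on_node(node, name, data), nodes)
>>       return sum(1 for ok in results if ok) >= QUORUM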
>>
>> I would also suggest expanding the data set a bit. For example, test
>> the performance after the system has been filled more than 50%. I would
>> also highly recommend testing performance when there are failures, such as
>> a dead disk, or one of the nodes going away.
>>
>> Thanks,
>>
>> --
>> Chuck
>>
>>
>> On Thu, Jun 12, 2014 at 3:46 AM, Vincenzo Pii <piiv at zhaw.ch> wrote:
>>
>>> As promised, the results for our study on Ceph vs Swift for object
>>> storage:
>>> http://blog.zhaw.ch/icclab/evaluating-the-performance-of-ceph-and-swift-for-object-storage-on-small-clusters/
>>>
>>>
>>> 2014-06-06 20:19 GMT+02:00 Matthew Farrellee <matt at redhat.com>:
>>>
>>>> On 06/02/2014 02:52 PM, Chuck Thier wrote:
>>>>
>>>>> I have heard that there has been some work to integrate Hadoop with
>>>>> Swift, but know very little about it. I haven't heard anything about
>>>>> integration with MS Exchange, but it could be an interesting use case.
>>>>>
>>>>
>>>> nutshell: the hadoop ecosystem tends to integrate with mapreduce or
>>>> hdfs. hdfs is the hadoop implementation of a distributed file system
>>>> interface. there are a handful of others -
>>>> http://wiki.apache.org/hadoop/HCFS - including one for swift. so where
>>>> you might access a file via hdfs://... you can also use swift://... for
>>>> processing by mapreduce or other frameworks in hadoop.
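>>>>
>>>> if it helps, the wiring is roughly a core-site.xml entry like the one
>>>> below (a sketch based on the hadoop-openstack module; the service name
>>>> "zhaw", endpoint, and credentials are placeholders for your keystone
>>>> setup, so double-check the property names against your hadoop version):
>>>>
>>>>   <!-- enable the swift:// filesystem from hadoop-openstack -->
>>>>   <property>
>>>>     <name>fs.swift.impl</name>
>>>>     <value>org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem</value>
>>>>   </property>
>>>>   <property>
>>>>     <name>fs.swift.service.zhaw.auth.url</name>
>>>>     <value>http://keystone-host:5000/v2.0/tokens</value>
>>>>   </property>
>>>>   <property>
>>>>     <name>fs.swift.service.zhaw.username</name>
>>>>     <value>hadoop</value>
>>>>   </property>
>>>>   <property>
>>>>     <name>fs.swift.service.zhaw.password</name>
>>>>     <value>secret</value>
>>>>   </property>
>>>>   <property>
>>>>     <name>fs.swift.service.zhaw.tenant</name>
>>>>     <value>demo</value>
>>>>   </property>
>>>>
>>>> after that a container becomes addressable as
>>>> swift://mycontainer.zhaw/path for mapreduce jobs.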
>>>>
>>>> best,
>>>>
>>>>
>>>> matt
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Vincenzo Pii
>>> Researcher, InIT Cloud Computing Lab
>>> Zurich University of Applied Sciences (ZHAW)
>>> http://www.cloudcomp.ch/
>>>
>>>
>>>
>>
>
>
> --
> Vincenzo Pii
> Researcher, InIT Cloud Computing Lab
> Zurich University of Applied Sciences (ZHAW)
> http://www.cloudcomp.ch/
>
>
>