[Openstack] Enabling data deduplication on Swift

Joe Gordon jogo at cloudscaling.com
Sat Mar 10 17:57:03 UTC 2012


Paulo, Caitlin,


Can SHA-1 collisions be generated?  If so, can you point me to the article?

Also, why compare only hashes in the first place?  Linux 'Kernel Samepage
Merging', which does page deduplication for KVM, does a full compare to be
safe [1].  Even if collisions can't be generated, what are the odds of a
collision (for SHA-1 and SHA-256) happening by chance when using Swift at
scale?
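
For the "by chance" part, a rough sketch of the usual birthday-bound
estimate is below.  It assumes hash outputs behave as uniformly random
values, so it says nothing about deliberately crafted collisions.  The
function name and the object count of 10**12 are hypothetical, chosen only
for illustration; they are not Swift figures.

```python
import math


def collision_probability(num_objects: int, hash_bits: int) -> float:
    """Approximate chance that any two of num_objects share a digest.

    Uses the birthday-bound approximation
        p ~= 1 - exp(-n * (n - 1) / 2**(bits + 1))
    and assumes digests are uniformly distributed.
    """
    exponent = -num_objects * (num_objects - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    n = 10**12  # hypothetical object count, for illustration only
    for name, bits in (("SHA-1", 160), ("SHA-256", 256)):
        print(f"{name}: p ~= {collision_probability(n, bits):.3e}")
```

A full byte-for-byte compare on a hash match, as KSM does, would remove
even this residual risk at the cost of an extra read.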


best,
Joe Gordon


[1] http://www.linux-kvm.com/sites/default/files/KvmForum2008_KSM.pdf


On Fri, Mar 9, 2012 at 4:44 PM, Caitlin Bestler <Caitlin.Bestler at nexenta.com> wrote:

> Paulo,
>
> I believe you’ll find that we’re thinking along the same lines. Please
> review my proposal at http://etherpad.openstack.org/P9MMYSWE6U
>
> One quick observation is that SHA-1 is totally inadequate for
> fingerprinting objects in a public object store. An attacker could easily
> predict the fingerprint of content likely to be posted, generate alternate
> content that had the same SHA-1 fingerprint and pre-empt the signature.
> For example: an ISO of an open source OS distribution. If I get my false
> content with the same fingerprint into the repository first then everyone
> who downloads that ISO will get my altered copy.
>
> SHA-256 is really needed to make this type of attack infeasible.
>
> I also think that distributed deduplication works very well with object
> versioning. Your comments on the proposal cited above would be great to
> hear.
>
> *From:* openstack-bounces+caitlin.bestler=nexenta.com at lists.launchpad.net
> [mailto:openstack-bounces+caitlin.bestler=nexenta.com at lists.launchpad.net]
> *On Behalf Of* Paulo Ricardo Motta Gomes
> *Sent:* Thursday, March 08, 2012 1:19 PM
> *To:* openstack at lists.launchpad.net
> *Subject:* [Openstack] Enabling data deduplication on Swift
>
> Hello everyone,
>
> I'm a student of the European Master in Distributed Computing (EMDC)
> currently working on my master thesis on distributed content-addressable
> storage/deduplication.
>
> I'm happy to announce I will be contributing the outcome of my thesis work
> to OpenStack by enabling both object-level and block-level deduplication
> functionality on Swift
> (https://answers.launchpad.net/swift/+question/156862).
>
> I have written a detailed blog post where I describe the initial
> architecture of my solution:
> http://paulormg.com/2012/03/05/enabling-deduplication-in-a-distributed-object-storage/
>
> Feedback from the OpenStack/Swift community would be very appreciated.
>
> Cheers,
> Paulo
>
> --
> European Master in Distributed Computing - www.kth.se/emdc
> Royal Institute of Technology - KTH
> Instituto Superior Técnico - IST
> http://paulormg.com
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack at lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>
>