[Openstack] Swift block-level deduplication

Eoghan Glynn eglynn at redhat.com
Thu Apr 12 16:09:25 UTC 2012



Folks,

From previous posts on the ML, it seems there are a couple of
efforts in train to add distributed content deduping to Swift.

My question is whether either or both of these approaches involve
active client participation in enabling duplicate chunk
detection?

One could see a spectrum ranging between:

1. Client actively breaks the object into chunks, selects the
   hashing algorithm, calculates each chunk's fingerprint, and
   uploads a chunk only if Swift reports its fingerprint as unknown.

2. Client determines which objects are worth deduping, maybe has
   some influence on chunk size and/or hashing, but fingerprint
   calculation is all handled internally by Swift.

3. Client is entirely uninvolved; deduplication is handled
   transparently in the object storage layer and enabled either
   globally or per-container.
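
To make the client-driven end of that spectrum (option 1) concrete,
here is a minimal Python sketch. It is an illustration only: the
FakeChunkStore class, its has() and put() methods, the fixed-size
chunking and the choice of SHA-256 are all assumptions made for the
example, not an existing Swift API or either proposal's design.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


class FakeChunkStore:
    """Hypothetical in-memory stand-in for a dedupe-aware store."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def has(self, fingerprint):
        # Stands in for "does the server already know this fingerprint?"
        return fingerprint in self.chunks

    def put(self, fingerprint, data):
        self.chunks[fingerprint] = data


def split_chunks(data, size=CHUNK_SIZE):
    """Fixed-size chunking; the last chunk may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def upload(store, data, size=CHUNK_SIZE):
    """Upload only chunks whose fingerprint the store does not know.

    Returns the ordered list of fingerprints (a manifest) and the
    number of chunks that actually had to be sent.
    """
    manifest = []
    uploaded = 0
    for chunk in split_chunks(data, size):
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if not store.has(fingerprint):
            store.put(fingerprint, chunk)
            uploaded += 1
        manifest.append(fingerprint)
    return manifest, uploaded
```

In this sketch the client does all the chunking and hashing, so the
saving is in upload bandwidth as well as storage. Under options 2
and 3 the same loop would run inside the storage layer instead, and
only the storage saving would remain.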

If anyone involved has insight into the above, I'd be interested
in hearing your thoughts (the context is leveraging dedupe in Glance).

Cheers,
Eoghan



