[openstack-dev] [Swift] (Non-)consistency of the Swift hash ring implementation

John Dickinson me at not.mn
Mon Sep 8 03:39:10 UTC 2014


To test Swift directly, I used the CLI tools that Swift provides for managing rings. I wrote the following short script:

$ cat remakerings
#!/bin/bash

# create a ring with 2^16 partitions, 3 replicas, and a min_part_hours of 0
swift-ring-builder object.builder create 16 3 0

# add 1200 devices: 4 zones x 25 servers x 12 drives, each with weight 3000
for zone in {1..4}; do
    for server in {200..224}; do
        for drive in {1..12}; do
            swift-ring-builder object.builder add r1z${zone}-10.0.${zone}.${server}:6010/d${drive} 3000
        done
    done
done
swift-ring-builder object.builder rebalance



This adds 1200 devices: 4 zones, each with 25 servers, each with 12 drives (4*25*12=1200). The important thing is that instead of adding 1000 drives in one zone or on one server, I'm splaying the devices across the placement hierarchy that Swift uses.

After running the script, I added one drive to one server to see what the impact would be, and rebalanced (see the commands below). The swift-ring-builder tool detected that less than 1% of the partitions would change and therefore didn't move anything, precisely to avoid unnecessary data movement.
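
For reference, the follow-up step was something like this (the exact device name and IP here are illustrative, not necessarily the ones I used):

$ swift-ring-builder object.builder add r1z1-10.0.1.200:6010/d13 3000
$ swift-ring-builder object.builder rebalance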

--John





On Sep 7, 2014, at 11:20 AM, Nejc Saje <nsaje at redhat.com> wrote:

> Hey guys,
> 
> in Ceilometer we're using consistent hash rings to do workload
> partitioning[1]. We considered using Ironic's hash ring implementation, but found out it wasn't actually consistent (ML[2], patch[3]). The next thing I noticed is that the Ironic implementation is based on Swift's.
> 
> The gist of it is: because the ring is divided into a fixed number of equal-sized partitions instead of hashing the hosts onto the ring, adding a new host re-maps an unbounded number of keys to different hosts (instead of the ~1/#nodes remapping guaranteed by a consistent hash ring).
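> 
> To make the guarantee concrete, here is a minimal consistent-hash-ring
> sketch (illustrative Python only; this is not Swift's or Ironic's actual
> code). Nodes are hashed onto the ring at several points, and a key maps
> to the first node point clockwise of the key's hash, so adding one node
> to a ring of 100 should move only about 1% of the keys:
> 
> import bisect
> import hashlib
> 
> VNODES = 100  # hash points per node; more points smooth the distribution
> 
> def _hash(value):
>     return int(hashlib.md5(value.encode()).hexdigest(), 16)
> 
> def build_ring(nodes):
>     # place every node on the ring at VNODES pseudo-random points
>     points = sorted((_hash('%s-%d' % (node, v)), node)
>                     for node in nodes for v in range(VNODES))
>     return [h for h, _ in points], [node for _, node in points]
> 
> def lookup(hashes, owners, key):
>     # first node point clockwise of the key's hash (wrapping at the end)
>     return owners[bisect.bisect(hashes, _hash(key)) % len(hashes)]
> 
> nodes = ['node%d' % i for i in range(100)]
> ring = build_ring(nodes)
> keys = ['object-%d' % i for i in range(100000)]
> before = dict((k, lookup(ring[0], ring[1], k)) for k in keys)
> ring = build_ring(nodes + ['node100'])
> moved = sum(1 for k in keys if lookup(ring[0], ring[1], k) != before[k])
> print('moved: %.2f%%' % (100.0 * moved / len(keys)))  # roughly 1%, ~1/#nodes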
> 
> Swift's hash ring implementation is quite complex though, so I took the conceptually similar code from Gregory Holt's blogpost[4] (which I'm guessing is based on Gregory's work on Swift's hash ring implementation) and tested that instead. With a simple test (paste[5]) that starts with 1000 nodes and then adds one more, 99.91% of the data was moved.
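> 
> For anyone who wants to see the failure mode without running the paste,
> here is a reconstruction of the effect (my own sketch, not the actual
> code from paste[5]), assuming the naive hash(key) % node_count mapping
> from the early part of the blogpost:
> 
> import hashlib
> 
> def node_for(key, node_count):
>     # naive mapping: any change to node_count reshuffles nearly all residues
>     return int(hashlib.md5(key.encode()).hexdigest(), 16) % node_count
> 
> keys = ['object-%d' % i for i in range(100000)]
> moved = sum(1 for k in keys if node_for(k, 1000) != node_for(k, 1001))
> print('moved: %.2f%%' % (100.0 * moved / len(keys)))  # ~99.9% of the data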
> 
> I have no way to test this in Swift directly, so I'm just throwing it out there for you guys to figure out whether there actually is a problem or not.
> 
> Cheers,
> Nejc
> 
> [1] https://review.openstack.org/#/c/113549/
> [2] http://lists.openstack.org/pipermail/openstack-dev/2014-September/044566.html
> [3] https://review.openstack.org/#/c/118932/4
> [4] http://greg.brim.net/page/building_a_consistent_hashing_ring.html
> [5] http://paste.openstack.org/show/107782/
