[openstack-dev] [Ironic] (Non-)consistency of the Ironic hash ring implementation

Nejc Saje nsaje at redhat.com
Thu Sep 4 11:42:02 UTC 2014



On 09/04/2014 11:51 AM, Robert Collins wrote:
> On 4 September 2014 19:53, Nejc Saje <nsaje at redhat.com> wrote:
>
>> I used the terms that are used in the original caching use-case, as
>> described in [1] and are used in the pypi lib as well[2]. With the correct
>> approach, there aren't actually any partitions, 'replicas' actually denotes
>> the number of times you hash a node onto the ring. As for nodes&keys, what's
>> your suggestion?
>
> So - we should change the Ironic terms then, I suspect (but lets check
> with Deva who wrote the original code where he got them from).
>
> The parameters we need to create a ring are:
>   - how many fallback positions we use for data (currently referred to
> as replicas)
>   - how many times we hash the servers hosting data into the ring
> (currently inferred via the hash_partition_exponent / server count)
>   - the servers
>
> and then we probe data items as we go.
>
> The original paper isn't
> http://www.martinbroadhurst.com/Consistent-Hash-Ring.html -
> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.147.1879 is
> referenced by it, and that paper doesn't include the term replica
> count at all. In other systems like cassandra, replicas generally
> refers to how many servers end up holding a copy of the data: Martin
> Broadhurts paper uses replica there in quite a different sense - I
> much prefer the Ironic use, which says how many servers will be
> operating on the data: its externally relevant.

It doesn't contain that term precisely, but it does talk about 
replicating the buckets. What about using a descriptive name for this 
parameter, like 'distribution_quality', where the higher the value, 
higher the distribution evenness (and higher memory usage)?

>
> I've no objection talking about keys, but 'node' is an API object in
> Ironic, so I'd rather we talk about hosts - or make it something
> clearly not node like 'bucket' (which the 1997 paper talks about in
> describing consistent hash functions).
>
> So proposal:
>   - key - a stringifyable thing to be mapped to buckets
What about using the term 'item' from the original paper as well?

>   - bucket a worker/store that wants keys mapped to it
>   - replicas - number of buckets a single key wants to be mapped to
Can we keep this as an Ironic-internal parameter? Because it doesn't 
really affect the hash ring. If you want multiple buckets for your item, 
you just continue your journey along the ring and keep returning new 
buckets. Check out how the pypi lib does it: 
https://github.com/Doist/hash_ring/blob/master/hash_ring/ring.py#L119

>   - partitions - number of total divisions of the hash space (power of
> 2 required)
I don't think there are any divisions of the hash space in the correct 
implementation, are there? I think that in the current Ironic 
implementation this tweaks the distribution quality, just like 
'replicas' parameter in Ceilo implementation.

Cheers,
Nejc

>
>> I've opened a bug[3], so you can add a Closes-Bug to your patch.
>
> Thanks!
>
> -Rob
>
>



More information about the OpenStack-dev mailing list