[openstack-dev] [gnocchi] new measures backlog scheduling

Julien Danjou julien at danjou.info
Thu Mar 2 14:52:14 UTC 2017


On Thu, Mar 02 2017, gordon chung wrote:

Hi gordon,

> i was thinking more about this yesterday. i've an idea.

You should have seen my face when I read that! ;-P

> how we store new measures
> -------------------------
>
> when we add new measures to be processed, the metric itself is already
> created in the indexer, so it already has an id. with the id, we can
> compute and store a shard/bucket location in the indexer with the
> metric. since the metric id is a uuid, we can just mod it with the
> number of buckets and that should give us a decent distribution. so
> with that, when we actually store the new measure, we will look at the
> bucket location associated with the metric.

Sounds good. What's interesting is how you implement a shard/bucket in
each driver. I imagine it's a container/bucket/directory.
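
Just to be sure we are talking about the same thing, here is a rough
sketch of the id -> bucket mapping I understand you are proposing;
NUM_BUCKETS and the function name are placeholders I made up, not
actual Gnocchi code:

  import uuid

  NUM_BUCKETS = 64  # hypothetical value kept in the indexer state/config table

  def bucket_for_metric(metric_id):
      """Map a metric uuid to a bucket by taking its integer value
      modulo the number of buckets."""
      return uuid.UUID(str(metric_id)).int % NUM_BUCKETS

  # Every incoming measure for a given metric lands in the same bucket.
  print(bucket_for_metric(uuid.uuid4()))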

> using the hashring idea, the buckets will be distributed among all the
> active metricd agents. the metricd agents will loop through all their
> assigned buckets based on the processing interval. the actual
> processing of each bucket will be similar to what we have now: grab
> metrics, queue them for the processing workers. the only difference is
> that instead of just grabbing the first x metrics and stopping, we keep
> grabbing until the bucket is 'clear'. this will help us avoid the
> current issue where some metrics are never scheduled because the return
> order puts them at the end.

It does not change the current issue that much, IIUC. The only
difference is that today we have 1 bucket and N metricd trying to empty
it, whereas with your proposal we would have M buckets with the N
metricd spread across them, so roughly N/M metricd trying to empty each
bucket. :)

At some scale (larger than the current one) it will improve things, but
it does not seem to be a drastic change.

(I am also not saying that I have a better solution :)
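
For what it's worth, here is how I read the per-bucket loop you
describe, as a rough sketch; list_unprocessed() and process() are
made-up helper names, not actual Gnocchi APIs:

  import time

  PROCESSING_INTERVAL = 60  # seconds; placeholder value

  def drain_bucket(storage, bucket):
      """Keep pulling metrics from one bucket until it reports empty,
      instead of stopping after the first batch."""
      while True:
          metrics = storage.list_unprocessed(bucket, limit=100)  # hypothetical
          if not metrics:
              break  # the bucket is 'clear'
          for metric in metrics:
              storage.process(metric)  # hypothetical hand-off to a worker

  def metricd_loop(storage, assigned_buckets):
      """Each metricd loops over the buckets the hashring assigned to it."""
      while True:
          for bucket in assigned_buckets:
              drain_bucket(storage, bucket)
          time.sleep(PROCESSING_INTERVAL)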

> we'll have a new agent (name here). this will walk through each metric
> in our indexer, recompute a new bucket location, and set it. this will
> make all new incoming points be pushed to the new location. this agent
> will also go to the old location (if different) and process any
> unprocessed measures of the metric. it will then move on to the next
> metric until complete.
>
> there will probably need to be a state/config table or something so
> the indexer knows the bucket size.
>
> i also think there might be a better partitioning technique to minimise 
> the number of metrics that change buckets... need to think about that more.

Yes, it's called consistent hashing, and that's what Swift and the like
are using.

Basically the idea is to create A LOT of buckets (more than your
maximum number of potential metricd), let's say 2^16, and then
distribute those buckets across your metricd, e.g. if you have 10
metricd they will each be responsible for about 6 554 buckets; when an
11th metricd comes up, you just have to recompute who's responsible for
which bucket. This is exactly what the new tooz partitioner system
provides, and we can leverage it easily:

  https://github.com/openstack/tooz/blob/master/tooz/partitioner.py#L25

All we have to do is create a lot of buckets and ask tooz which buckets
belong to each metricd, and then make them poll over and over again
(sigh) to empty them.
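
Roughly, the usage would look like the sketch below; the coordinator
URL, the group name and the bucket count are placeholders, so
double-check against the partitioner code linked above:

  import uuid

  from tooz import coordination

  coordinator = coordination.get_coordinator(
      "memcached://localhost:11211", uuid.uuid4().hex.encode())
  coordinator.start(start_heart=True)

  # Joining a partitioned group builds a hashring of all the group
  # members behind the scenes.
  partitioner = coordinator.join_partitioned_group(b"gnocchi-metricd")

  NUM_BUCKETS = 2 ** 16

  def is_mine(bucket_number):
      # Ask the hashring whether this bucket belongs to this metricd.
      return partitioner.belongs_to_self(bucket_number)

  my_buckets = [b for b in range(NUM_BUCKETS) if is_mine(b)]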

This makes sure you DON'T have to rebalance your buckets like you
proposed earlier, which is costly, long and painful.
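
To illustrate: with a hash ring, adding one metricd only moves roughly
1/(n+1) of the buckets to the newcomer, everything else stays where it
was. Here is a toy demonstration (not tooz's actual implementation):

  import hashlib

  def ring_owner(bucket, members):
      """Toy consistent hashing: a bucket is owned by the member whose
      hashed position is closest, clockwise, to the bucket's hash."""
      def pos(value):
          return int(hashlib.md5(str(value).encode()).hexdigest(), 16)
      bucket_pos = pos(bucket)
      return min(members, key=lambda m: (pos(m) - bucket_pos) % 2 ** 128)

  buckets = range(2 ** 10)
  ten = ["metricd-%d" % i for i in range(10)]
  eleven = ["metricd-%d" % i for i in range(11)]
  before = {b: ring_owner(b, ten) for b in buckets}
  after = {b: ring_owner(b, eleven) for b in buckets}

  moved = sum(1 for b in buckets if before[b] != after[b])
  print("%d/%d buckets changed owner" % (moved, len(buckets)))  # ~1/11th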

-- 
Julien Danjou
// Free Software hacker
// https://julien.danjou.info