[openstack-dev] [gnocchi] new measures backlog scheduling

gordon chung gord at live.ca
Thu Mar 2 14:31:15 UTC 2017



On 15/11/16 04:53 AM, Julien Danjou wrote:
> Yeah in the case of the Swift driver for Gnocchi, I'm not really sure
> how much buckets we should create. Should we make the user pick a random
> number like the number of partition in Swift and then create the
> containers in Swift? Or can we have something simpler? (I like automagic
> things). WDYT Gordon?


i was thinking more about this yesterday, and i have an idea.


how we store new measures
-------------------------

when we add new measures to be processed, the metric itself is already 
created in the indexer, so it already has an id. with that id, we can 
compute and store a shard/bucket location in the indexer alongside the 
metric. since the metric id is a uuid, we can just mod it with the 
number of buckets and that should give us a decent distribution. with 
that in place, when we actually store a new measure, we look up the 
bucket location associated with the metric.
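a minimal sketch of that mapping, assuming a configurable bucket count 
(NUM_BUCKETS is an illustrative name, not an existing option):

```python
import uuid

NUM_BUCKETS = 32  # assumption: configurable, see sizing discussion below


def bucket_for_metric(metric_id, num_buckets=NUM_BUCKETS):
    """map a metric's uuid to a bucket by taking its integer value
    mod the bucket count; same id always yields the same bucket."""
    return uuid.UUID(str(metric_id)).int % num_buckets
```

the result would be stored in the indexer with the metric row, so the 
incoming-measure path only needs a lookup, not a recomputation.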


how we process measures
-----------------------

using the hashring idea, the buckets will be distributed among all the 
active metricd agents. each metricd agent will loop through its 
assigned buckets based on the processing interval. the actual processing 
of each bucket will be similar to what we have now: grab metrics, queue 
them for the processing workers. the only difference is that instead of 
grabbing the first x metrics and stopping, we keep grabbing until the 
bucket is 'clear'. this will help us avoid the current issue where some 
metrics are never scheduled because the return order puts them at the end.
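a sketch of the drain-until-clear loop, using an in-memory stand-in for 
the bucket (the real driver would list objects from the storage backend; 
FakeBucket and its methods are hypothetical names):

```python
import queue


class FakeBucket:
    """in-memory stand-in for a measure bucket (assumption: the real
    driver lists pending measure objects from file/ceph/swift)."""

    def __init__(self, metric_ids):
        self.pending = list(metric_ids)

    def list_pending(self, limit):
        return self.pending[:limit]

    def ack(self, metric_id):
        self.pending.remove(metric_id)


def process_bucket(bucket, work_queue, batch_size=2):
    """drain the bucket completely instead of stopping after the first
    batch, so no metric is starved by the listing order."""
    while True:
        batch = bucket.list_pending(batch_size)
        if not batch:
            break
        for metric_id in batch:
            work_queue.put(metric_id)  # hand off to processing workers
            bucket.ack(metric_id)
```

the key difference from the current scheduler is only the outer `while` 
loop: it keeps listing until the bucket reports nothing pending.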


how we change bucket size
-------------------------

we'll have a new agent (name here). it will walk through each metric in 
our indexer, recompute a new bucket location, and set it. this makes all 
new incoming points get pushed to the new location. the agent will also 
go to the old location (if different) and process any unprocessed 
measures for the metric. it then moves on to the next metric until complete.
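the resize walk could look something like this (Metric, resize_buckets, 
and drain_old_bucket are all hypothetical names for this sketch):

```python
import uuid
from dataclasses import dataclass


@dataclass
class Metric:
    """minimal stand-in for an indexer metric row."""
    id: uuid.UUID
    bucket: int = 0


def resize_buckets(metrics, num_old, num_new, drain_old_bucket):
    """walk every metric, repoint it at its new bucket, and drain any
    unprocessed measures left in the old bucket when the two differ."""
    for metric in metrics:
        old = metric.id.int % num_old
        new = metric.id.int % num_new
        metric.bucket = new                # new incoming points land here
        if old != new:
            drain_old_bucket(metric, old)  # process leftovers in old spot
```

setting the new location first means points that arrive mid-walk already 
go to the right bucket, and the drain step only has to clean up history.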

there will probably need to be a state/config table or something so the 
indexer knows the bucket size.

i also think there might be a better partitioning technique to minimise 
the number of metrics that change buckets... need to think about that more.
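for illustration only (not something settled in this thread): one known 
technique with exactly that property is jump consistent hash, which 
moves only ~1/(n+1) of keys when the bucket count grows from n to n+1, 
versus roughly half with plain modulo:

```python
def jump_hash(key, num_buckets):
    """jump consistent hash: maps a 64-bit key to a bucket such that
    growing the bucket count relocates only ~1/(n+1) of the keys."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, masked to stay in range
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b
```

the trade-off is that the metric's uuid would need to be reduced to a 
64-bit key first, and bucket counts can only be resized, not arbitrarily 
remapped per metric.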


what we set default bucket size to
----------------------------------

32? say we aim for a default of 10K metrics; that puts ~310 metrics (and 
their measure objects from POST) in each bucket... or 64?
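the arithmetic behind those numbers, assuming a uniform distribution of 
metric ids across buckets:

```python
# expected metrics per bucket for the two candidate defaults
target_metrics = 10_000
per_bucket = {n: target_metrics // n for n in (32, 64)}
# 32 buckets -> ~312 metrics each, 64 buckets -> ~156 each
```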


cheers,

-- 
gord


