[openstack-dev] [gnocchi] new measures backlog scheduling
gordon chung
gord at live.ca
Thu Mar 2 14:31:15 UTC 2017
On 15/11/16 04:53 AM, Julien Danjou wrote:
> Yeah in the case of the Swift driver for Gnocchi, I'm not really sure
> how many buckets we should create. Should we make the user pick a random
> number like the number of partitions in Swift and then create the
> containers in Swift? Or can we have something simpler? (I like automagic
> things.) WDYT Gordon?
i was thinking more about this yesterday and i have an idea.
how we store new measures
-------------------------
when we add new measures to be processed, the metric itself is already
created in the indexer, so it already has an id. with that id, we can
compute and store a shard/bucket location in the indexer with the metric.
since the metric id is a uuid, we can just mod it with the number of
buckets and that should give us a decent distribution. so when we actually
store a new measure, we look up the bucket location associated with the
metric.
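
something like this, just a sketch (the helper name and default are made
up; the real bucket count would live in a config/state table):

    import uuid

    # hypothetical default; the real value would come from a config/state table
    NUM_BUCKETS = 32

    def bucket_for_metric(metric_id, num_buckets=NUM_BUCKETS):
        # a metric id is a uuid, so taking its integer value mod the number
        # of buckets should spread metrics roughly evenly across buckets
        return uuid.UUID(str(metric_id)).int % num_buckets

    # computed once at metric creation and stored in the indexer with the
    # metric; new measures are then written to that bucket
    bucket = bucket_for_metric("5c1f3b7e-9c1a-4e2d-8f63-2b1d9a7c4e10")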
how we process measures
-----------------------
using the hashring idea, the buckets will be distributed among all the
active metricd agents. the metricd agents will loop through all of their
assigned buckets based on the processing interval. the actual processing
of each bucket will be similar to what we have now: grab metrics and queue
them for the processing workers. the only difference is that instead of
just grabbing the first x metrics and stopping, we keep grabbing until the
bucket is 'clear'. this will help us avoid the current issue where some
metrics are never scheduled because the return order puts them at the end.
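
roughly, in pseudo-python (the hashring/storage calls below are
placeholders i made up, not real gnocchi interfaces):

    import time

    def my_buckets(hashring, agent_id, num_buckets):
        # buckets this agent owns according to the hashring
        return [b for b in range(num_buckets)
                if hashring.get_node(str(b)) == agent_id]

    def scheduling_loop(storage, hashring, agent_id, num_buckets,
                        interval=60, chunk=100):
        while True:
            for bucket in my_buckets(hashring, agent_id, num_buckets):
                # keep grabbing until the bucket is clear, rather than
                # grabbing the first x metrics and stopping
                while True:
                    metrics = storage.list_metrics_with_new_measures(
                        bucket, limit=chunk)
                    if not metrics:
                        break
                    for metric in metrics:
                        # hand off to processing workers; assumes processed
                        # metrics drop out of the bucket's pending list
                        storage.queue_for_processing(metric)
            time.sleep(interval)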
how we change bucket size
-------------------------
we'll have a new agent (name here). it will walk through each metric in
our indexer, recompute a new bucket location, and set it. that will make
all new incoming points get pushed to the new location. the agent will
also go to the old location (if different) and process any unprocessed
measures for the metric. it then moves on to the next metric until
complete. there will probably need to be a state/config table or something
so the indexer knows the bucket size.
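
something along these lines, as a sketch (the indexer/storage method names
here are made up for illustration; it reuses the bucket_for_metric helper
from the earlier sketch):

    def reshard(indexer, storage, new_num_buckets):
        # record the new size in the state/config table so the indexer
        # places newly created metrics correctly
        indexer.set_config('num_buckets', new_num_buckets)

        for metric in indexer.list_metrics():
            old_bucket = metric.bucket
            new_bucket = bucket_for_metric(metric.id, new_num_buckets)
            if new_bucket == old_bucket:
                continue
            # point new incoming measures at the new location first...
            indexer.update_metric_bucket(metric.id, new_bucket)
            # ...then drain any unprocessed measures from the old location
            storage.process_unprocessed_measures(metric, old_bucket)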
i also think there might be a better partitioning technique to minimise
the number of metrics that change buckets... need to think about that more.
what we set default bucket size to
----------------------------------
32? say we aim for a default of 10K metrics; that puts ~310 metrics (and
their measure objects from POST) in each bucket... or 64?
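
back-of-the-envelope:

    # rough per-bucket load at the 10K-metric target
    10000 / 32  # ~312 metrics (plus their measure objects) per bucket
    10000 / 64  # ~156 per bucket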
cheers,
--
gord