[openstack-dev] [gnocchi] new measures backlog scheduling

Julien Danjou julien at danjou.info
Tue Apr 18 09:21:02 UTC 2017


On Mon, Apr 17 2017, gordon chung wrote:

Hi Gordon,

> i've started to implement multiple buckets and the initial tests look 
> promising. here are some of the things i've done:
>
> - dropped the scheduler process and let processing workers figure out 
> tasks themselves
> - each sack is now handled fully (not counting anything added after the 
> processing worker starts on it)
> - the number of sacks is static
>
> after the above changes, i've been testing it and it works pretty well: 
> i'm able to process 40K metrics, 60 points each, in 8-10 minutes with 
> 54 workers, where it took significantly longer before.

Great!

> the issues i've run into:
>
> - dynamic sack size
> making the number of sacks dynamic is a concern. previously, we said to 
> put the sack count in the conf file. the concern is that changing that 
> option incorrectly 'corrupts' the db into a state it cannot recover 
> from: it will constantly have stray unprocessed measures. if we change 
> the db path incorrectly, by contrast, we don't actually corrupt 
> anything, we just lose data. we've said we don't want sack mappings in 
> the indexer, so it seems to me the only safe solution is to make the 
> sack count static and only changeable by hacking?

Not hacking, we just need a proper tool to rebalance it.
As I already wrote, I think it's good enough to have this documented and
set to a moderately good value by default (e.g. 4096). There's no need
to store it in a configuration file; it should be stored in the storage
driver itself, when the storage is initialized via `gnocchi-upgrade',
to avoid any mistake.
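
To make the corruption concern concrete, here is a minimal illustrative
sketch (not Gnocchi's actual code; the metadata helpers are
hypothetical): the metric-to-sack mapping is a pure function of the
sack count, so if the count changes, most metrics map to a different
sack and their queued measures are never picked up again. Pinning the
count in the storage itself at `gnocchi-upgrade' time avoids that.

    import hashlib

    def sack_for_metric(metric_id, num_sacks):
        # A metric always lands in the same sack *for a given
        # num_sacks*; change num_sacks and the mapping silently moves.
        h = hashlib.md5(str(metric_id).encode()).hexdigest()
        return int(h, 16) % num_sacks

    # Hypothetical storage hooks: write the count once at upgrade
    # time and read it back in every worker, so no configuration file
    # can drift out of sync with the stored data.
    def upgrade(storage, num_sacks=4096):
        storage.set_metadata('num_sacks', num_sacks)

    def get_sack(storage, metric_id):
        return sack_for_metric(metric_id,
                               storage.get_metadata('num_sacks'))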

> - sack distribution
> to distribute sacks across workers, i initially implemented consistent 
> hashing. the issue i noticed is that because the hashring inherently 
> has a non-uniform distribution[1], i would have workers sitting idle 
> because they were given fewer sacks, while other workers were still 
> working.
>
> i also tried to implement jump hash[2] (a minimal sketch follows this 
> quote), which improved distribution and is, in theory, less memory 
> intensive as it does not maintain a hash table. while better at 
> distribution, it is still not completely uniform and, similarly, the 
> fewer sacks per worker, the worse the distribution.
>
> lastly, i tried just simple locking, where each worker is completely 
> unaware of any other worker and handles all sacks. it locks the sack 
> it is working on, so if another worker tries to work on that sack, it 
> just skips it. this effectively puts an additional load on the locking 
> system (in my case redis), as each worker makes x lock requests where 
> x is the number of sacks. so if we have 50 workers and 2048 sacks, 
> that is ~102K lock requests per cycle. this is in addition to the n 
> lock requests per metric (10K-1M metrics?). it does guarantee that if 
> a worker is free and there is work to be done, it will do it.
>
> i guess the question i have is: by using a non-uniform hash, it seems 
> we possibly gain less load at the expense of efficiency/'speed'. the 
> number of sacks/tasks we have is stable; it won't really change. the 
> number of metricd workers may change, but not constantly. lastly, the 
> number of sacks per worker will always be relatively low (10:1, 100:1, 
> assuming the max number of sacks is 2048). given these conditions, do 
> we need consistent/jump hashing? is it better to just modulo sacks 
> across workers to ensure 'uniform' distribution and allow a 'larger' 
> set of buckets to be reshuffled when workers are added?
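
For reference, a minimal Python sketch of the jump consistent hash
mentioned above[2], following Lamping & Veach (2014); it maps an
integer key to one of num_buckets buckets in O(ln n) time with no hash
table:

    def jump_hash(key, num_buckets):
        # Each iteration "jumps" to the next bucket the key would move
        # to as buckets are added; the last jump below num_buckets is
        # the answer. O(1) memory, minimal reshuffling on resize.
        b, j = -1, 0
        while j < num_buckets:
            b = j
            key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
            j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
        return b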

What about using the hashring with replicas (e.g. 3 by default) and a
lock per sack? This should largely reduce the number of lock attempts
that you see. If you have 2k sacks divided across 50 workers with 3
replicas each, each process cares about ~122 sacks, so each might send
~122 acquire() attempts, which is 50 × 122 = 6100 acquire requests, 17
times fewer than 102k.
This also solves the problem of non-uniform distribution, as having
replicas makes sure every node gets some work.
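
A self-contained sketch of that assignment (using rendezvous hashing
with top-k replicas as a stand-in for the actual hashring; worker ids
and counts are made up): each worker computes its own share of sacks
and only ever tries to lock those.

    import hashlib

    def _score(worker, sack):
        key = ('%s-%d' % (worker, sack)).encode()
        return int(hashlib.md5(key).hexdigest(), 16)

    def sacks_for_worker(me, workers, num_sacks=2048, replicas=3):
        # A sack belongs to the `replicas` highest-scoring workers, so
        # every sack has several candidate processors and a slow or
        # dead worker doesn't leave its sacks unattended.
        return [s for s in range(num_sacks)
                if me in sorted(workers, key=lambda w: _score(w, s),
                                reverse=True)[:replicas]]

With 2048 sacks, 50 workers and 3 replicas, each worker owns about
2048 × 3 / 50 ≈ 122 sacks, i.e. ~6100 lock attempts per cycle in total
instead of 50 × 2048 = 102400.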

You can then probably remove the per-metric lock too: it is only used
when processing new measures (where the sack lock is enough) and when
expunging metrics. You can safely use the same sack lock for expunging
metrics. We may just need to move that out of the janitor? Something to
think about!
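
A sketch of what the worker loop could look like with one tooz lock
per sack (process_sack() and expunge_marked_metrics() are hypothetical
names, not the actual metricd code); the same sack lock covers both
new-measure processing and expunging:

    from tooz import coordination

    coord = coordination.get_coordinator('redis://localhost:6379',
                                         b'metricd-worker-1')
    coord.start()

    def run_cycle(my_sacks):
        for sack in my_sacks:
            lock = coord.get_lock(('sack-%d' % sack).encode())
            # Non-blocking: if a replica already holds this sack, just
            # skip it and move on to the next one.
            if not lock.acquire(blocking=False):
                continue
            try:
                process_sack(sack)            # new measures in the sack
                expunge_marked_metrics(sack)  # same lock is enough here
            finally:
                lock.release()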

Cheers,
-- 
Julien Danjou
-- Free Software hacker
-- https://julien.danjou.info