[openstack-dev] [gnocchi] new measures backlog scheduling

gordon chung gord at live.ca
Tue Apr 18 12:26:22 UTC 2017



On 18/04/17 05:21 AM, Julien Danjou wrote:
>> - dynamic sack size
>> making the number of sacks dynamic is a concern. previously, we said to
>> have the sack size in the conf file. the concern is that changing that
>> option incorrectly actually 'corrupts' the db to a state it cannot
>> recover from: it will constantly have stray unprocessed measures. if we
>> change the db path incorrectly, we don't actually corrupt anything, we
>> just lose data. we've said we don't want sack mappings in the indexer,
>> so it seems to me the only safe solution is to make the sack size
>> static and only changeable by hacking?
>
> Not hacking, we just need a proper tool to rebalance it.
> As I already wrote, I think it's good enough to have this documented and
> set to a reasonably good value by default (e.g. 4096). There's no need to
> store it in a configuration file; it should be stored in the storage
> driver itself to avoid any mistake, when the storage is initialized via
> `gnocchi-upgrade'.

the issue i see is not with how the sacks will be assigned to metricd 
but how metrics (not the daemon) are assigned to sacks. i don't think 
storing the value in a storage object solves the issue: when would we 
load/read it, when the api and metricd processes start up? it seems 
changing it would require: 1) all services to be shut down and 2) a 
completely clean incoming storage path. if either of those steps isn't 
done, you have corrupt incoming storage. if this is a requirement and 
both steps are done successfully, it means any kind of 'live upgrade' 
is impossible in gnocchi.
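
to illustrate the mapping problem (toy sketch only -- not gnocchi's 
actual code; assume a metric is routed to a sack with something like 
hash(metric_id) % num_sacks):

import uuid

# toy sketch only -- not gnocchi's actual code. assume a metric is
# routed to a sack with something like hash(metric_id) % num_sacks.
def sack_for_metric(metric_id, num_sacks):
    return metric_id.int % num_sacks

metric = uuid.uuid4()
old_sack = sack_for_metric(metric, 2048)
new_sack = sack_for_metric(metric, 4096)
# if num_sacks changes while unprocessed measures are still sitting in
# old_sack, metricd now only looks in new_sack, so those measures are
# stranded -- the 'corruption' described above.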

>
>> - sack distribution
>> to distribute sacks across workers, i initially implemented consistent
>> hashing. the issue i noticed is that because hashring inherently has a
>> non-uniform distribution[1], i would have workers sitting idle because
>> they were given fewer sacks, while other workers were still working.
>>
>> i also tried to implement jump hash[2], which improved distribution and
>> is, in theory, less memory intensive as it does not maintain a hash
>> table. while better at distribution, it is still not completely uniform
>> and, similarly, the fewer sacks per worker, the worse the distribution.
>>
>> lastly, i tried just simple locking where each worker is completely
>> unaware of any other worker and handles all sacks. it will lock the
>> sack it is working on, so if another worker tries to work on it, it
>> will just skip. this effectively puts an additional requirement on the
>> locking system (in my case redis), as each worker will make x lock
>> requests where x is the number of sacks. so if we have 50 workers and
>> 2048 sacks, it will be 102K requests per cycle. this is in addition to
>> the n lock requests per metric (10K-1M metrics?). this does guarantee
>> that if a worker is free and there is work to be done, it will do it.
>>
>> i guess the question i have is: by using a non-uniform hash, it seems
>> we possibly gain less load at the expense of efficiency/'speed'. the
>> number of sacks/tasks we have is stable, it won't really change. the
>> number of metricd workers may change, but not constantly. lastly, the
>> number of sacks per worker will always be relatively low (10:1, 100:1
>> assuming the max number of sacks is 2048). given these conditions, do
>> we need consistent/jump hashing? is it better to just modulo sacks and
>> ensure 'uniform' distribution, allowing for a 'larger' set of buckets
>> to be reshuffled when workers are added?
>
> What about using the hashring with replicas (e.g. 3 by default) and a
> lock per sack? This should largely reduce the number of lock tries that
> you see. If you have 2k sacks divided across 50 workers and each one is
> replicated, that makes each process care about 122 sacks, so they might
> send 122 acquire() tries each, which is 50 × 122 = 6100 acquire
> requests, 17 times fewer than 102k.
> This also solves the problem of non-uniform distribution, as having
> replicas makes sure every node gets some work.
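
(to spell out that arithmetic, with the same example numbers quoted 
above -- just a quick sanity check, not gnocchi code:)

# every worker polling every sack vs. hashring with 3 replicas and a
# lock per sack
workers, sacks, replicas = 50, 2048, 3
print(workers * sacks)                           # 102400 lock attempts/cycle
sacks_per_worker = sacks * replicas // workers   # 122 sacks each
print(workers * sacks_per_worker)                # 6100 lock attempts/cycle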

i did test w/ 2 replicas (see: google sheet) and it's still non-uniform 
but better than without replicas: ~4%-30% vs ~8%-45%. we could also 
minimise the number of lock calls by dividing sacks across workers per 
agent (rough sketch below).
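
a rough sketch of what i mean by dividing per agent (purely 
hypothetical, names included -- nothing implemented): each agent splits 
the sacks it owns across its local worker processes, so lock attempts 
per cycle drop from owned_sacks * local_workers to roughly owned_sacks.

# hypothetical sketch: hand each of the agent's local worker processes
# a disjoint slice of the sacks the agent is responsible for.
def sacks_for_local_worker(owned_sacks, worker_index, workers_per_agent):
    return [s for i, s in enumerate(sorted(owned_sacks))
            if i % workers_per_agent == worker_index]

# e.g. an agent owning 123 sacks with 4 local workers: worker 0 only
# ever tries to lock 31 of them.
print(len(sacks_for_local_worker(range(123), 0, 4)))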

going to play devil's advocate now: using hashring in our use case will 
always hurt throughput (even with perfect distribution, since the sack 
contents themselves are not uniform). returning to the original 
question, is using hashring worth it? i don't think we're even 
leveraging the re-balancing aspect of hashring.
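
for comparison, here's roughly what i mean by 'just modulo sacks' next 
to jump hash -- the jump hash below is the textbook Lamping/Veach 
version sketched from [2], not the exact code i tested:

import collections

# textbook jump consistent hash from [2]; not the exact code i tested.
def jump_hash(key, num_buckets):
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b

workers, sacks = 50, 2048

# 'just modulo sacks': perfectly even (every worker gets 40 or 41 sacks)
# but adding a worker reshuffles most sack->worker assignments.
modulo = collections.Counter(s % workers for s in range(sacks))

# jump hash: minimal reshuffling when workers are added, but per-worker
# counts spread around the ~41-sack mean, so some workers finish early.
jump = collections.Counter(jump_hash(s, workers) for s in range(sacks))

print(min(modulo.values()), max(modulo.values()))
print(min(jump.values()), max(jump.values()))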

>
> You can then probably remove the per-metric lock too: it is just used
> when processing new measures (here the sack lock is enough) and when
> expunging metrics. You can safely use the same sack lock for expunging
> metrics. We may just need to move it out of the janitor? Something to
> think about!
>

good point, we may not need to lock the sack for expunging at all, since 
the metric is already marked as deleted in the indexer so it is 
effectively not accessible.


-- 
gord

