[openstack-dev] [gnocchi] typical length of timeseries data

gordon chung gord at live.ca
Fri Jul 29 15:33:29 UTC 2016



On 29/07/2016 5:00 AM, Julien Danjou wrote:
> Best way is probably to do some bench… but I think it really depends on
> the use cases here. The interest of having many small splits is that you
> can parallelize the read.
>
> Considering the compression ratio we have, I think we should split in
> smaller files. I'd pick 3600 and give it a try.
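just to illustrate the parallel-read point above, a rough sketch (not the actual storage driver code; read_splits/fetch_split below are only stand-ins for whatever the backend driver exposes):

from concurrent.futures import ThreadPoolExecutor

def read_splits(fetch_split, split_keys, max_workers=8):
    # fetch_split is whatever retrieves a single split object from the
    # backend (swift/ceph/file); with many small splits each GET is
    # independent, so they can be issued concurrently and merged after
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_split, split_keys))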

i gave this a quick try with a series of ~68k points.

with an object size of 14400 points (uncompressed), i got:

[gchung@gchung-dev ~(keystone_admin)]$ time gnocchi measures show dc51c402-67e6-4b28-aba0-9d46b35b5397 --granularity 60 &> /tmp/blah

real	0m6.398s
user	0m5.003s
sys	0m0.071s

it took ~39.45s to process the measures into 24 different aggregated 
series, and it created 6 split objects.

with an object size of 3600 points (uncompressed), i got:

[gchung@gchung-dev ~(keystone_admin)]$ time gnocchi measures show 301947fd-97ee-428a-b445-41a67ee62c38 --granularity 60 &> /tmp/blah

real	0m6.495s
user	0m4.970s
sys	0m0.073s

it took ~39.89s to process the measures into 24 different aggregated 
series, and it created 21 split objects.
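for what it's worth, the split counts roughly match the naive arithmetic; the extra splits are presumably because splits are keyed on aligned time boundaries rather than starting exactly at the first point:

import math

points = 68000  # ~68k points in the 60s series
for split_size in (14400, 3600):
    # naive count if splits were just contiguous chunks of the series;
    # observed counts were 6 and 21, slightly higher than this
    print(split_size, math.ceil(points / split_size))
# 14400 -> 5
# 3600  -> 19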

so at first glance, it doesn't really seem to affect performance much 
whether it's one 'larger' file or many smaller files. that said, with 
the new proposed v3 serialisation format, a larger file requires more 
additional padding, which is not a good thing.
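
to make the padding point concrete, a rough back-of-envelope, assuming the serialisation pre-allocates a fixed-size slot per point in a split (bytes_per_point is just a placeholder, not the real v3 figure):

def worst_case_padding(points_per_split, bytes_per_point=8):
    # a split holding a single point still has to pad out the rest of
    # its slots, so worst-case padding grows linearly with split size
    return (points_per_split - 1) * bytes_per_point

for size in (3600, 14400):
    print(size, worst_case_padding(size))
# a 14400-point split can waste ~4x the padding of a 3600-point one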

cheers,

-- 
gord

