[openstack-dev] [nova] Bogus -1 scores from turbo hipster

Robert Collins robertc at robertcollins.net
Fri Jan 10 08:42:44 UTC 2014


On 9 January 2014 07:05, Samuel Merritt <sam at swiftstack.com> wrote:
> On 1/7/14 2:53 PM, Michael Still wrote:

>> So applying migration 206 took slightly over a minute (67 seconds).
>> Our historical data (mean + 2 standard deviations) says that this
>> migration should take no more than 63 seconds. So this only just
>> failed the test.
>
>
> It seems to me that requiring a runtime less than (mean + 2 stddev) leads to
> a false-positive rate of 1 in 40, right? If the runtimes have a normal(-ish)
> distribution, then 95% of them will be within 2 standard deviations of the
> mean, so that's 1 in 20 falling outside that range. Then discard the ones
> that are faster than (mean - 2 stddev), and that leaves 1 in 40. Please
> correct me if I'm wrong; I'm no statistician.

Your math is right, but the performance distribution isn't necessarily
normal - there's some minimum time the operation takes (call this the
ideal time), and then there are things like having to do I/O which make
it worse. So if you're testing on idle systems, most of the time you're
near the ideal, and sometimes you're worse - but you're never better.
That gives a skewed distribution with a long right tail, so a mean + 2
stddev cutoff won't trip at exactly 1 run in 40.
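
As a quick illustration (a sketch only - the distribution shape and all
numbers below are made up, not turbo-hipster's actual model), simulating
runtimes as an ideal floor plus log-normal overhead shows the mean + 2
stddev cutoff firing at some rate other than 1 in 40:

import random
import statistics

random.seed(42)

def sample_runtime():
    # hypothetical migration: fixed floor plus long-tailed I/O overhead
    ideal = 50.0
    overhead = random.lognormvariate(1.0, 1.0)
    return ideal + overhead

history = [sample_runtime() for _ in range(30)]   # "historical" runs
threshold = statistics.mean(history) + 2 * statistics.stdev(history)

# measure how often a fresh run would spuriously fail the gate
trials = 100000
failures = sum(sample_runtime() > threshold for _ in range(trials))
print("threshold %.1fs, false failures: 1 in %.0f"
      % (threshold, trials / failures))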

The acid-test question is whether the things that make the time worse
are things we should consider when evaluating the time. For instance,
I/O contention with other VMs - ignore it. Database engines deciding to
do garbage collection at just the wrong time - we probably want to
consider that, because it is something production systems may encounter
(or we should put a gc step in the deploy process and test that, etc.).

I think we should pick a target confidence level - e.g. 95% - and then
from that calculate how many historical runs we need before we can be
confident that false failures won't occur more often than that. The
number of runs will be more than 3 though :).
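
To make that concrete - again just a sketch, using the nonparametric
approach of taking the slowest of n historical runs as the threshold,
which may not be how we'd actually implement it: the chance that all n
runs land below the true 95th percentile is 0.95**n, so we need the
smallest n where that drops under 5%:

import math

confidence = 0.95  # how confident we want to be in the threshold
quantile = 0.95    # the percentile the threshold should cover

# P(all n samples fall below the true 95th percentile) = quantile**n;
# require quantile**n <= 1 - confidence
n = math.ceil(math.log(1 - confidence) / math.log(quantile))
print(n)  # -> 59, rather more than 3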

-Rob


-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud


