[openstack-dev] [nova] Bogus -1 scores from turbo hipster

Samuel Merritt sam at swiftstack.com
Wed Jan 8 18:05:48 UTC 2014


On 1/7/14 2:53 PM, Michael Still wrote:
> Hi. Thanks for reaching out about this.
>
> It seems this patch has now passed turbo hipster, so I am going to
> treat this as a more theoretical question than perhaps you intended. I
> should note though that Joshua Hesketh and I have been trying to read
> / triage every turbo hipster failure, but that has been hard this week
> because we're both at a conference.
>
> The problem this patch faced is that we are having trouble defining
> what is a reasonable amount of time for a database migration to run
> for. Specifically:
>
> 2014-01-07 14:59:32,012 [output] 205 -> 206...
> 2014-01-07 14:59:32,848 [heartbeat]
> 2014-01-07 15:00:02,848 [heartbeat]
> 2014-01-07 15:00:32,849 [heartbeat]
> 2014-01-07 15:00:39,197 [output] done
>
> So applying migration 206 took slightly over a minute (67 seconds).
> Our historical data (mean + 2 standard deviations) says that this
> migration should take no more than 63 seconds. So this only just
> failed the test.

It seems to me that requiring a runtime less than (mean + 2 stddev)
leads to a false-positive rate of about 1 in 40, right? If the runtimes
have a normal(-ish) distribution, then roughly 95% of them will be within
2 standard deviations of the mean, so that's 1 in 20 falling outside that
range. Discard the ones that are faster than (mean - 2 stddev), since
those would not fail the test, and that leaves 1 in 40. Please correct me
if I'm wrong; I'm no statistician.
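
A quick sanity check of that estimate, as a sketch only: it assumes the
runtimes are exactly normally distributed, which real migration timings
may well not be, and `upper_tail` is just an illustrative helper name.

```python
import math

def upper_tail(k):
    """P(X > mean + k*stddev) for a normally distributed X (assumption)."""
    # The standard normal survival function, written with the
    # complementary error function from the standard library.
    return 0.5 * math.erfc(k / math.sqrt(2))

p = upper_tail(2.0)
print(f"false-positive rate per run: {p:.4f} (about 1 in {1 / p:.0f})")
```

Under that assumption the exact one-sided figure is about 2.3%, or
roughly 1 in 44, so the back-of-the-envelope 1 in 40 is in the right
ballpark.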

Such a high false-positive rate may make it too easy to ignore turbo
hipster as the bot that cried wolf. This problem already exists with
Jenkins and the devstack/tempest tests; when one of those fails, I don't
wonder what I broke, but rather how many times I'll have to recheck the
patch until the tests pass.

Unfortunately, I don't have a solution to offer, but perhaps someone 
else will.


