[openstack-dev] flaky tempest -- Top Offenders
Sean Dague
sean at dague.net
Fri Sep 27 11:30:21 UTC 2013
On 09/26/2013 09:41 PM, Joe Gordon wrote:
> Hi All,
> As many of you may have suspected, the gate has gotten less stable in the
> past few days. Turns out we have the numbers to prove it too!
> http://graphite.openstack.org/graphlot/?width=586&from=00%3A00_20130919&_salt=1380244287.508&height=308&target=summarize(stats_counts.zuul.pipeline.gate.job.gate-tempest-devstack-vm-neutron.FAILURE%2C%2224h%22)&target=summarize(stats_counts.zuul.pipeline.gate.job.gate-tempest-devstack-vm-neutron.SUCCESS%2C%2224h%22)&until=23%3A59_20130926&lineMode=staircase
> So tempest started failing more right around the 24th, even though we
> are in FeatureFreeze.
> "FF ensures that sufficient share of the ReleaseCycle
> <https://wiki.openstack.org/wiki/ReleaseCycle> is dedicated to QA, until
> we produce the first release candidates. Limiting the changes that
> affect the behavior of the software allow for consistent testing and
> efficient bugfixing."
> https://wiki.openstack.org/wiki/FeatureFreeze
> Thanks to the work we have been doing with logstash and elastic-recheck,
> we have very good numbers on the top offenders and when they began. The
> good news is that two bugs account for most of the hits, so the
> top offenders list has just two entries. But there are still other unknown
> bugs and lower priority ones out there too!
> https://bugs.launchpad.net/tempest/+bug/1226337 -- Launchpad bug 1226337
> in tempest
> "tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern flake
> failure" [High,Triaged]
> Started on 9-23, with 408 hits in the last 24 hours alone!
> http://logstash.openstack.org/#eyJzZWFyY2giOiJAbWVzc2FnZTpcIk5vdmFFeGNlcHRpb246IGlTQ1NJIGRldmljZSBub3QgZm91bmQgYXRcIiBBTkQgQGZpZWxkcy5idWlsZF9zdGF0dXM6XCJGQUlMVVJFXCIgQU5EIEBmaWVsZHMuZmlsZW5hbWU6XCJsb2dzL3NjcmVlbi1uLWNwdS50eHRcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTM4MDI0NDY2ODQ5Nn0=
> https://bugs.launchpad.net/tempest/+bug/1230407 -- Launchpad bug
> 1230407 in neutron "State change timeout exceeded" [Undecided,Confirmed]
> Started on 9-25 with 66 hits in the last 24 hours alone
> http://logstash.openstack.org/#eyJzZWFyY2giOiIgQG1lc3NhZ2U6XCJBc3NlcnRpb25FcnJvcjogU3RhdGUgY2hhbmdlIHRpbWVvdXQgZXhjZWVkZWQhXCIgQU5EIEBmaWVsZHMuYnVpbGRfc3RhdHVzOlwiRkFJTFVSRVwiIEFORCBAZmllbGRzLmZpbGVuYW1lOlwiY29uc29sZS5odG1sXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjEzODAyNDQ0MzM2NzZ9
This second one looks like an issue with the Neutron DB layer, which
seems to deadlock itself on agent updates, so DB assistance would be good.
I've set that bug to Critical and RC1 for Neutron, because right now
it's bouncing at least 50% of the changes out of the gate (and as such
we're starving out the check queue for devstack nodes, so no changes
have made progress for 12 hrs over there).
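For anyone who wants to see the query behind the logstash links quoted
above: the part after the "#" looks like base64-encoded JSON. Below is a
minimal Python sketch for decoding such a link, assuming that encoding
holds. The function name decode_logstash_fragment and the sample payload
are made up for illustration; the payload is a stand-in, not copied from
either link.

```python
import base64
import json
from urllib.parse import urlsplit


def decode_logstash_fragment(url):
    """Return the JSON object stored in the '#...' part of a logstash link.

    Hypothetical helper: assumes the fragment is plain base64 over JSON.
    """
    fragment = urlsplit(url).fragment
    # Restore any '=' padding that was trimmed from the base64 text.
    fragment += "=" * (-len(fragment) % 4)
    return json.loads(base64.b64decode(fragment).decode("utf-8"))


if __name__ == "__main__":
    # Stand-in payload, built here so the example is self-contained.
    sample = {"search": '@message:"example error"', "timeframe": "604800"}
    encoded = base64.b64encode(json.dumps(sample).encode("utf-8")).decode("ascii")
    link = "http://logstash.openstack.org/#" + encoded
    print(decode_logstash_fragment(link)["search"])
```

Passing one of the full links from the quoted message instead of the
stand-in should show the search string, time frame and graph mode that
were used to count the hits.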
Sean Dague