[openstack-dev] Top Gate Bugs

Joe Gordon joe.gordon0 at gmail.com
Wed Dec 4 13:22:23 UTC 2013


TL;DR: The gate is failing 23% of the time due to bugs in nova, neutron,
and tempest. We need help fixing these bugs.


Hi All,

Before going any further: we have a bug that is affecting both the trunk
and stable gates, so it's getting top priority here. elastic-recheck
currently doesn't track unit tests because we don't expect them to fail
very often. It turns out that assessment was wrong; we now have a nova
py27 unit test bug in both the trunk and stable gates.

https://bugs.launchpad.net/nova/+bug/1216851
Title: nova unit tests occasionally fail migration tests for mysql and
postgres
Hits
  FAILURE: 74
The failures appear multiple times for a single job, and some of those
are due to bad patches in the check queue. But this is being seen in both
the stable and trunk gates, so something is definitely wrong.

=======


It's time for another edition of 'Top Gate Bugs.' I am sending this out
now because, in addition to our usual gate bugs, a few new ones have
cropped up recently, and as we saw a few weeks ago it doesn't take very
many new bugs to wedge the gate.

Currently the gate has a failure rate of at least 23%! [0]

Note: this email was generated with
http://status.openstack.org/elastic-recheck/ and 'elastic-recheck-success'
[1]
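
For anyone curious how the 'Hits / FAILURE' counts below are produced:
conceptually, job results that match a bug's elastic-recheck query are
grouped by bug and the failures are counted. The short Python sketch
below is purely illustrative (the data shape and values are made up; the
real logic lives in elastic-recheck, see [1]):

from collections import Counter

# Hypothetical, already-classified job results as (bug number, build
# status) pairs. In reality elastic-recheck matches logstash hits
# against per-bug queries.
classified_hits = [
    ('1253896', 'FAILURE'),
    ('1253896', 'FAILURE'),
    ('1251448', 'FAILURE'),
    ('1253896', 'SUCCESS'),
]

# Count only the FAILURE hits for each bug.
failure_hits = Counter(
    bug for bug, status in classified_hits if status == 'FAILURE')

for bug, count in failure_hits.most_common():
    print("https://bugs.launchpad.net/bugs/%s  FAILURE: %d" % (bug, count))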

1) https://bugs.launchpad.net/bugs/1253896
Title: test_minimum_basic_scenario fails with SSHException: Error reading
SSH protocol banner
Projects:  neutron, nova, tempest
Hits
  FAILURE: 324
This one has been around for several weeks now, and although we have made
some attempts at fixing it, we aren't any closer to resolving it than we
were a few weeks ago.

2) https://bugs.launchpad.net/bugs/1251448
Title: BadRequest: Multiple possible networks found, use a Network ID to be
more specific.
Project: neutron
Hits
  FAILURE: 141

3) https://bugs.launchpad.net/bugs/1249065
Title: Tempest failure: tempest/scenario/test_snapshot_pattern.py
Project: nova
Hits
  FAILURE: 112
This is a bug in nova's neutron code.

4) https://bugs.launchpad.net/bugs/1250168
Title: gate-tempest-devstack-vm-neutron-large-ops is failing
Projects: neutron, nova
Hits
  FAILURE: 94
This is an old bug that was fixed but came back on December 3rd, so this
is a recent regression. This may be an infra issue.

5) https://bugs.launchpad.net/bugs/1210483
Title: ServerAddressesTestXML.test_list_server_addresses FAIL
Projects: neutron, nova
Hits
  FAILURE: 73
This has had some attempts made at fixing it, but it's still around.


In addition to the existing bugs, we have some new bugs on the rise:

1) https://bugs.launchpad.net/bugs/1257626
Title: Timeout while waiting on RPC response - topic: "network", RPC
method: "allocate_for_instance" info: "<unknown>"
Project: nova
Hits
  FAILURE: 52
This is a large-ops-only bug. It has been around for at least two weeks,
but we have seen it in higher numbers starting around December 3rd. This
may be an infrastructure issue, as neutron-large-ops started failing more
around the same time.

2) https://bugs.launchpad.net/bugs/1257641
Title: Quota exceeded for instances: Requested 1, but already used 10 of 10
instances
Projects: nova, tempest
Hits
  FAILURE: 41
Like the previous bug, this has been around for at least two weeks but
appears to be on the rise.



Raw Data: http://paste.openstack.org/show/54419/


best,
Joe


[0] failure rate = 1 - (success rate of gate-tempest-dsvm-neutron) *
(success rate of ...) * ...

gate-tempest-dsvm-neutron = 0.00
gate-tempest-dsvm-neutron-large-ops = 11.11
gate-tempest-dsvm-full = 11.11
gate-tempest-dsvm-large-ops = 4.55
gate-tempest-dsvm-postgres-full = 10.00
gate-grenade-dsvm = 0.00

(I hope I got the math right here)
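
To make the arithmetic in [0] concrete, here is a rough Python sketch of
combining the per-job numbers above into an overall gate failure rate. It
assumes the numbers listed are per-job failure percentages and that job
results are independent; it is only an illustration of the formula, not
the actual elastic-recheck-success code, and the exact figure depends on
which jobs and time window are included:

# Illustration only: combine per-job failure percentages into an overall
# gate failure rate, assuming job results are independent.
job_failure_pct = {
    'gate-tempest-dsvm-neutron': 0.00,
    'gate-tempest-dsvm-neutron-large-ops': 11.11,
    'gate-tempest-dsvm-full': 11.11,
    'gate-tempest-dsvm-large-ops': 4.55,
    'gate-tempest-dsvm-postgres-full': 10.00,
    'gate-grenade-dsvm': 0.00,
}

success_all_jobs = 1.0
for pct in job_failure_pct.values():
    success_all_jobs *= 1.0 - pct / 100.0

failure_rate = 1.0 - success_all_jobs
print("overall gate failure rate: %.1f%%" % (failure_rate * 100))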

[1]
http://git.openstack.org/cgit/openstack-infra/elastic-recheck/tree/elastic_recheck/cmd/check_success.py