[openstack-dev] Top Gate Bugs

Christopher Yeoh cbkyeoh at gmail.com
Fri Nov 22 03:13:45 UTC 2013


On Fri, Nov 22, 2013 at 2:28 AM, Matt Riedemann
<mriedem at linux.vnet.ibm.com> wrote:

>
>
> On Wednesday, November 20, 2013 11:53:45 PM, Clark Boylan wrote:
>
>> On Wed, Nov 20, 2013 at 9:43 PM, Ken'ichi Ohmichi <ken1ohmichi at gmail.com>
>> wrote:
>>
>>> Hi Joe,
>>>
>>> 2013/11/20 Joe Gordon <joe.gordon0 at gmail.com>:
>>>
>>>> Hi All,
>>>>
>>>> As many of you have noticed, the gate has been in very bad shape over
>>>> the past few days. Here is a list of some of the top open bugs we are
>>>> hitting (ones without pending patches and with many recent hits). The
>>>> gate won't be stable, and it will be hard to get your code merged,
>>>> until we fix these bugs.
>>>>
>>>> 1) https://bugs.launchpad.net/bugs/1251920
>>>> nova
>>>> 468 Hits
>>>>
>>>
>>> Is there a way to see the frequency of each failure?
>>> I'm working on 1251920 and have put up an investigation patch for Tempest:
>>>   https://review.openstack.org/#/c/57193/
>>>
>>> The patch has avoided this problem 4 times so far, but I am not sure
>>> whether it is worthwhile.
>>>
>>>
>>> Thanks
>>> Ken'ichi Ohmichi
>>>
>>> ---
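
(Side note on 1251920 for anyone else digging in: the failure is an
assertion that the instance's console output came back empty. I haven't
studied what https://review.openstack.org/#/c/57193/ actually changes, but
the usual mitigation for this kind of failure is to poll for console
output with a timeout rather than asserting on the first response. A
minimal sketch of that idea, assuming a callable that wraps a Tempest
servers_client call and returns the console log as a string:

    import time

    def wait_for_console_output(get_console_output, server_id,
                                timeout=300, interval=5):
        """Poll a server's console until non-empty output comes back.

        get_console_output is assumed to be a callable (for example a
        thin wrapper around a Tempest servers_client method) that takes
        a server id and returns the console log as a string.
        """
        deadline = time.time() + timeout
        while time.time() < deadline:
            output = get_console_output(server_id)
            if output and output.strip():
                return output
            time.sleep(interval)
        raise AssertionError("console output was still empty after %s "
                             "seconds" % timeout)

Whether that just papers over the real problem (an instance that never
produces any console output at all) is a separate question, and presumably
what the investigation patch is trying to answer.)
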
>>>
>>>> 2) https://bugs.launchpad.net/bugs/1251784
>>>> neutron, nova
>>>> 328 Hits
>>>> 3) https://bugs.launchpad.net/bugs/1249065
>>>> neutron
>>>> 122 Hits
>>>> 4) https://bugs.launchpad.net/bugs/1251448
>>>> neutron
>>>> 65 Hits
>>>>
>>>> Raw Data:
>>>>
>>>>
>>>> Note: If a bug has any hits for anything besides failure, it means the
>>>> fingerprint isn't perfect.
>>>>
>>>> Elastic recheck known issues:
>>>>
>>>> Bug: https://bugs.launchpad.net/bugs/1251920
>>>>   Fingerprint: message:"assertionerror: console output was empty"
>>>>     AND filename:"console.html"
>>>>   Title: Tempest failures due to failure to return console logs from
>>>>     an instance
>>>>   Status: nova: Confirmed
>>>>   Hits: FAILURE: 468
>>>>
>>>> Bug: https://bugs.launchpad.net/bugs/1251784
>>>>   Fingerprint: message:"Connection to neutron failed: Maximum
>>>>     attempts reached" AND filename:"logs/screen-n-cpu.txt"
>>>>   Title: nova+neutron scheduling error: Connection to neutron failed:
>>>>     Maximum attempts reached
>>>>   Status: neutron: New, nova: New
>>>>   Hits: FAILURE: 328, UNSTABLE: 13, SUCCESS: 275
>>>>
>>>> Bug: https://bugs.launchpad.net/bugs/1240256
>>>>   Fingerprint: message:" 503" AND filename:"logs/syslog.txt" AND
>>>>     syslog_program:"proxy-server"
>>>>   Title: swift proxy-server returning 503 during tempest run
>>>>   Status: openstack-ci: Incomplete, swift: New, tempest: New
>>>>   Hits: FAILURE: 136, SUCCESS: 83
>>>>   Pending Patch
>>>>
>>>> Bug: https://bugs.launchpad.net/bugs/1249065
>>>>   Fingerprint: message:"No nw_info cache associated with instance"
>>>>     AND filename:"logs/screen-n-api.txt"
>>>>   Title: Tempest failure: tempest/scenario/test_snapshot_pattern.py
>>>>   Status: neutron: New, nova: Confirmed
>>>>   Hits: FAILURE: 122
>>>>
>>>> Bug: https://bugs.launchpad.net/bugs/1252514
>>>>   Fingerprint: message:"Got error from Swift: put_object" AND
>>>>     filename:"logs/screen-g-api.txt"
>>>>   Title: glance doesn't recover if Swift returns an error
>>>>   Status: devstack: New, glance: New, swift: New
>>>>   Hits: FAILURE: 95
>>>>   Pending Patch
>>>>
>>>> Bug: https://bugs.launchpad.net/bugs/1244255
>>>>   Fingerprint: message:"NovaException: Unexpected
>>>>     vif_type=binding_failed" AND filename:"logs/screen-n-cpu.txt"
>>>>   Title: binding_failed because of l2 agent assumed down
>>>>   Status: neutron: Fix Committed
>>>>   Hits: FAILURE: 92, SUCCESS: 29
>>>>
>>>> Bug: https://bugs.launchpad.net/bugs/1251448
>>>>   Fingerprint: message:" possible networks found, use a Network ID to
>>>>     be more specific. (HTTP 400)" AND filename:"console.html"
>>>>   Title: BadRequest: Multiple possible networks found, use a Network
>>>>     ID to be more specific.
>>>>   Status: neutron: New
>>>>   Hits: FAILURE: 65
>>>>
>>>> Bug: https://bugs.launchpad.net/bugs/1239856
>>>>   Fingerprint: message:"tempest/services" AND
>>>>     message:"/images_client.py" AND message:"wait_for_image_status"
>>>>     AND filename:"console.html"
>>>>   Title: "TimeoutException: Request timed out" on
>>>>     tempest.api.compute.images.test_list_image_filters.ListImageFiltersTestXML
>>>>   Status: glance: New
>>>>   Hits: FAILURE: 62
>>>>
>>>> Bug: https://bugs.launchpad.net/bugs/1235435
>>>>   Fingerprint: message:"One or more ports have an IP allocation from
>>>>     this subnet" AND message:" SubnetInUse: Unable to complete
>>>>     operation on subnet" AND filename:"logs/screen-q-svc.txt"
>>>>   Title: 'SubnetInUse: Unable to complete operation on subnet UUID.
>>>>     One or more ports have an IP allocation from this subnet.'
>>>>   Status: neutron: Incomplete, nova: Fix Committed, tempest: New
>>>>   Hits: FAILURE: 48
>>>>
>>>> Bug: https://bugs.launchpad.net/bugs/1224001
>>>>   Fingerprint: message:"tempest.scenario.test_network_basic_ops
>>>>     AssertionError: Timed out waiting for" AND filename:"console.html"
>>>>   Title: test_network_basic_ops fails waiting for network to become
>>>>     available
>>>>   Status: neutron: In Progress, swift: Invalid, tempest: Invalid
>>>>   Hits: FAILURE: 42
>>>>
>>>> Bug: https://bugs.launchpad.net/bugs/1218391
>>>>   Fingerprint: message:"Cannot 'createImage'" AND
>>>>     filename:"console.html"
>>>>   Title: tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestXML.test_delete_image_that_is_not_yet_active
>>>>     spurious failure
>>>>   Status: nova: Confirmed, swift: Confirmed, tempest: Confirmed
>>>>   Hits: FAILURE: 25
>>>>
>>>>
>>>>
>>>> best,
>>>> Joe Gordon
>>>>
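
(A note on reading the fingerprints above: each one is just a Lucene-style
query string that elastic-recheck runs against the indexed gate logs, so
you can reproduce a hit count yourself. A rough sketch using the
elasticsearch Python client follows; the endpoint and index pattern are
assumptions on my part, so check what the logstash.openstack.org cluster
actually exposes before relying on them:

    from elasticsearch import Elasticsearch

    # Assumed endpoint and index pattern; verify against the real
    # logstash.openstack.org setup before relying on this.
    es = Elasticsearch(["http://logstash.openstack.org:9200"])

    # Fingerprint for bug 1251920, copied verbatim from the list above.
    fingerprint = ('message:"assertionerror: console output was empty" '
                   'AND filename:"console.html"')

    result = es.search(
        index="logstash-*",
        body={"query": {"query_string": {"query": fingerprint}}},
        size=0,  # we only need the hit count, not the documents
    )
    print("hits:", result["hits"]["total"])

I believe the indexed log lines also carry a build status field, which is
how the report splits hits into FAILURE/SUCCESS/UNSTABLE buckets.)
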
>>
>> Joe seemed to be on the same track with
>> https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:57578,n,z
>> but went a step further and reverted the change that introduced that
>> test. A couple of people were going to keep rechecking those changes to
>> run them through more tests and see if 1251920 goes away.
>>
>> I don't quite understand why this test is problematic (Joe indicated it
>> went in at about the time 1251920 became a problem), and I would be very
>> interested in finding out why it caused a problem.
>>
>> You can see frequencies for bugs with known signatures at
>> http://status.openstack.org/elastic-recheck/
>>
>> Hope this helps.
>>
>> Clark
>>
>>
> Joe is tracking some notes in an etherpad here:
>
> https://etherpad.openstack.org/p/critical-patches-gatecrash-November-2013
>
> I've added https://review.openstack.org/#/c/57069/ and
> https://review.openstack.org/#/c/57042/ to the list.
>
>
That has been really useful. Having a well-known page we can go to when the
gate gets into really poor shape would be very handy (or we could just put
it in the various IRC channel topics).

Also, when the gate is stuck, does it make sense to ask people to stop
doing rechecks so that those pushing debugging patches get quicker
feedback?

Chris