[openstack-dev] Gate Status - Friday Edition

Joe Gordon joe.gordon0 at gmail.com
Sat Jan 25 00:07:57 UTC 2014


On Fri, Jan 24, 2014 at 6:57 PM, Salvatore Orlando <sorlando at nicira.com> wrote:

> I've found out that several jobs are exhibiting failures like bug 1254890
> [1] and bug 1253896 [2] because openvswitch seems to be crashing the
> kernel. The kernel trace usually reports neutron-ns-metadata-proxy or
> dnsmasq as the offending process, but [3] seems to point clearly to
> ovs-vsctl.
> 254 events observed over the previous 6 days show a similar trace in the
> logs [4].
> While this alone won't explain all the failures observed, it is
> potentially one of the prominent root causes.
>
> The logs give me few hints about the running kernel. It seems there has
> been no kernel update in the past 7 days, but I can't be sure.
> Openvswitch builds are updated periodically. The last build I found not to
> trigger failures was the one generated on 2014/01/16 at 01:58:18.
> Unfortunately, version-wise I only ever see 1.4.0, with no build number.
>
> I don't know if this will require getting in touch with Ubuntu, or if we
> can just prep a different image with an OVS build known to work without
> problems.
>
> Salvatore
>
> [1] https://bugs.launchpad.net/neutron/+bug/1254890
> [2] https://bugs.launchpad.net/neutron/+bug/1253896
> [3] http://paste.openstack.org/show/61869/
> [4] "kernel BUG at /build/buildd/linux-3.2.0/fs/buffer.c:2917" and
> filename:syslog.txt
>
>
Do you want to track this as a separate bug with its own e-r fingerprint? It
will overlap with the other two bugs, but it will give us good numbers on
status.openstack.org/elastic-recheck/
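
The query string you already have in [4] looks like a reasonable starting
point for a fingerprint. As a quick sanity check on the hit count before
proposing it, something like the rough sketch below should work (untested;
the Elasticsearch _search endpoint is an assumption about what
logstash.openstack.org exposes, so adjust as needed):

    #!/usr/bin/env python
    # Rough sketch: count logstash hits for the proposed e-r fingerprint.
    # ES_URL is an assumption about the search endpoint; the query string
    # itself is taken verbatim from [4] above.
    import json
    import urllib2

    ES_URL = "http://logstash.openstack.org/elasticsearch/_search"
    QUERY = ('message:"kernel BUG at /build/buildd/linux-3.2.0/fs/buffer.c:2917"'
             ' AND filename:"syslog.txt"')

    body = json.dumps({
        "query": {"query_string": {"query": QUERY}},
        "size": 0,  # we only care about the total hit count
    })
    req = urllib2.Request(ES_URL, body, {"Content-Type": "application/json"})
    print(json.load(urllib2.urlopen(req))["hits"]["total"])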


>
> On 24 January 2014 21:13, Clay Gerrard <clay.gerrard at gmail.com> wrote:
>
>> OH yeah that's much better.  I had found those eventually but had to dig
>> through all that other stuff :'(
>>
>> Moving forward I think we can keep an eye on that page, open bugs for
>> the tests causing issues, and dig in.
>>
>> Thanks again!
>>
>> -Clay
>>
>>
>> On Fri, Jan 24, 2014 at 11:37 AM, Sean Dague <sean at dague.net> wrote:
>>
>>> On 01/24/2014 02:02 PM, Peter Portante wrote:
>>> > Hi Sean,
>>> >
>>> > In the last 7 days I see only 6 python27-based test
>>> > failures:
>>> http://logstash.openstack.org/#eyJzZWFyY2giOiJwcm9qZWN0Olwib3BlbnN0YWNrL3N3aWZ0XCIgQU5EIGJ1aWxkX3F1ZXVlOmdhdGUgQU5EIGJ1aWxkX25hbWU6Z2F0ZS1zd2lmdC1weXRob24qIEFORCBtZXNzYWdlOlwiRVJST1I6ICAgcHkyNzogY29tbWFuZHMgZmFpbGVkXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjEzOTA1ODk2Mjk0MDR9
>>> >
>>> > And 4 python26-based test
>>> > failures:
>>> http://logstash.openstack.org/#eyJzZWFyY2giOiJwcm9qZWN0Olwib3BlbnN0YWNrL3N3aWZ0XCIgQU5EIGJ1aWxkX3F1ZXVlOmdhdGUgQU5EIGJ1aWxkX25hbWU6Z2F0ZS1zd2lmdC1weXRob24qIEFORCBtZXNzYWdlOlwiRVJST1I6ICAgcHkyNjogY29tbWFuZHMgZmFpbGVkXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjEzOTA1ODk1MzAzNTd9
>>> >
>>> > Maybe the query you posted captures failures where the job did not
>>> even run?
>>> >
>>> > And only 15 hits (well, 18, but three are within the same job, and some
>>> > of the tests are run twice, so it comes to a combined 10
>>> > hits):
>>> http://logstash.openstack.org/#eyJzZWFyY2giOiJwcm9qZWN0Olwib3BlbnN0YWNrL3N3aWZ0XCIgQU5EIGJ1aWxkX3F1ZXVlOmdhdGUgQU5EIGJ1aWxkX25hbWU6Z2F0ZS1zd2lmdC1weXRob24qIEFORCBtZXNzYWdlOlwiRkFJTDpcIiBhbmQgbWVzc2FnZTpcInRlc3RcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTM5MDU4OTg1NTAzMX0=
>>> >
>>> >
>>> > Thanks,
>>>
>>> So it is true that the Interrupted exceptions (raised when a job is
>>> killed because of a gate reset) are sometimes turned into Fail events
>>> by the system. This is one of the reasons the graphite data for
>>> failures is incorrect; if you use just the graphite sourcing for fails,
>>> your numbers will be overly pessimistic.
>>>
>>> The following are probably better lists:
>>>  - http://status.openstack.org/elastic-recheck/data/uncategorized.html#gate-swift-python26 (7 uncategorized fails)
>>>  - http://status.openstack.org/elastic-recheck/data/uncategorized.html#gate-swift-python27 (5 uncategorized fails)
>>>
>>>         -Sean
>>>
>>> --
>>> Sean Dague
>>> Samsung Research America
>>> sean at dague.net / sean.dague at samsung.com
>>> http://dague.net
>>>