[openstack-dev] Gate Status - Friday Edition

Salvatore Orlando sorlando at nicira.com
Fri Jan 24 23:57:51 UTC 2014


I've found out that several jobs are exhibiting failures like bug 1254890
[1] and bug 1253896 [2] because openvswitch seems to be crashing the kernel.
The kernel trace usually reports either neutron-ns-metadata-proxy or dnsmasq
as the offending process, but [3] seems to point clearly to ovs-vsctl.
254 events observed in the previous 6 days show a similar trace in the logs
[4].
This means that while this alone won't explain all the failures observed,
it is potentially one of the prominent root causes.
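
For anyone who wants to check a downloaded syslog.txt for the same signature
as query [4], a minimal sketch follows. It is only an illustration: the
function name count_kernel_bugs and the sample log lines are made up, and
this is not how the logstash query itself is executed.

```python
import re
from collections import Counter

# Signature taken from query [4]. The source path and line number are
# captured rather than hard-coded, in case other kernels report a
# different location.
KERNEL_BUG = re.compile(r"kernel BUG at (\S+?):(\d+)")


def count_kernel_bugs(lines):
    """Return a Counter mapping (source_file, line_no) to number of hits."""
    hits = Counter()
    for line in lines:
        match = KERNEL_BUG.search(line)
        if match:
            hits[(match.group(1), int(match.group(2)))] += 1
    return hits


# Hypothetical sample lines standing in for a job's syslog.txt.
sample = [
    "Jan 24 21:02:11 devstack kernel: [ 1234.5] kernel BUG at "
    "/build/buildd/linux-3.2.0/fs/buffer.c:2917!",
    "Jan 24 21:02:11 devstack kernel: [ 1234.6] invalid opcode: 0000 [#1] SMP",
    "Jan 24 21:02:12 devstack dnsmasq[4242]: started",
]

for (path, line_no), count in count_kernel_bugs(sample).items():
    print(f"{count} hit(s): {path}:{line_no}")
```

To use it on a real log, pass an open file object instead of the sample list,
e.g. count_kernel_bugs(open("syslog.txt")).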

From the logs I have few hints about which kernel is running. It seems there
has been no update in the past 7 days, but I can't be sure.
Openvswitch builds are updated periodically. The last build I found not to
trigger failures was the one generated on 2014/01/16 at 01:58:18.
Unfortunately the version is always reported only as 1.4.0, with no build
number.

I don't know if this will require getting in touch with Ubuntu, or if we
can just prep a different image with an OVS build known to work without
problems.

Salvatore

[1] https://bugs.launchpad.net/neutron/+bug/1254890
[2] https://bugs.launchpad.net/neutron/+bug/1253896
[3] http://paste.openstack.org/show/61869/
[4] "kernel BUG at /build/buildd/linux-3.2.0/fs/buffer.c:2917" and
filename:syslog.txt


On 24 January 2014 21:13, Clay Gerrard <clay.gerrard at gmail.com> wrote:

> OH yeah that's much better.  I had found those eventually but had to dig
> through all that other stuff :'(
>
> Moving forward I think we can keep an eye on that page, open bugs for
> those tests causing issue and dig in.
>
> Thanks again!
>
> -Clay
>
>
> On Fri, Jan 24, 2014 at 11:37 AM, Sean Dague <sean at dague.net> wrote:
>
>> On 01/24/2014 02:02 PM, Peter Portante wrote:
>> > Hi Sean,
>> >
>> > In the last 7 days I see only 6 python27 based test
>> > failures:
>> http://logstash.openstack.org/#eyJzZWFyY2giOiJwcm9qZWN0Olwib3BlbnN0YWNrL3N3aWZ0XCIgQU5EIGJ1aWxkX3F1ZXVlOmdhdGUgQU5EIGJ1aWxkX25hbWU6Z2F0ZS1zd2lmdC1weXRob24qIEFORCBtZXNzYWdlOlwiRVJST1I6ICAgcHkyNzogY29tbWFuZHMgZmFpbGVkXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjEzOTA1ODk2Mjk0MDR9
>> >
>> > And 4 python26 based test
>> > failures:
>> http://logstash.openstack.org/#eyJzZWFyY2giOiJwcm9qZWN0Olwib3BlbnN0YWNrL3N3aWZ0XCIgQU5EIGJ1aWxkX3F1ZXVlOmdhdGUgQU5EIGJ1aWxkX25hbWU6Z2F0ZS1zd2lmdC1weXRob24qIEFORCBtZXNzYWdlOlwiRVJST1I6ICAgcHkyNjogY29tbWFuZHMgZmFpbGVkXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjEzOTA1ODk1MzAzNTd9
>> >
>> > Maybe the query you posted captures failures where the job did not even
>> run?
>> >
>> > And only 15 hits (well, 18, but three are within the same job, and some
>> > of the tests are run twice, so it is a combined total of 10
>> > hits):
>> http://logstash.openstack.org/#eyJzZWFyY2giOiJwcm9qZWN0Olwib3BlbnN0YWNrL3N3aWZ0XCIgQU5EIGJ1aWxkX3F1ZXVlOmdhdGUgQU5EIGJ1aWxkX25hbWU6Z2F0ZS1zd2lmdC1weXRob24qIEFORCBtZXNzYWdlOlwiRkFJTDpcIiBhbmQgbWVzc2FnZTpcInRlc3RcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTM5MDU4OTg1NTAzMX0=
>> >
>> >
>> > Thanks,
>>
>> So it is true that the Interrupted exceptions (which occur when a job is
>> killed because of a reset) are sometimes being turned into Fail events
>> by the system, which is one of the reasons the graphite data for
>> failures is incorrect. If you use just the graphite sourcing for
>> fails, your numbers will be overly pessimistic.
>>
>> The following are probably better lists:
>>  - http://status.openstack.org/elastic-recheck/data/uncategorized.html#gate-swift-python26
>>    (7 uncategorized fails)
>>  - http://status.openstack.org/elastic-recheck/data/uncategorized.html#gate-swift-python27
>>    (5 uncategorized fails)
>>
>>         -Sean
>>
>> --
>> Sean Dague
>> Samsung Research America
>> sean at dague.net / sean.dague at samsung.com
>> http://dague.net
>>
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>