[openstack-dev] [neutron][infra] Functional job failure rate at 100%

Jakub Libosvar jlibosva at redhat.com
Wed Aug 9 13:29:04 UTC 2017


Daniel Alvarez and I spent some time looking at it and the culprit was
finally found.

tl;dr

The kernel on the test machines was updated to a version containing a
bug in conntrack entry creation, which makes the functional tests get
stuck. More info at [4].

For now, I sent a patch [5] that temporarily disables the jobs that
create conntrack entries manually; its commit message still needs an
update. Once it merges, we can make the functional job voting again to
avoid regressions.

Is it possible to switch the image used for the Jenkins machines back to
the older version? Any other ideas on how to deal with the kernel bug?

Thanks
Jakub

[5] https://review.openstack.org/#/c/492068/1

On 07/08/2017 11:52, Jakub Libosvar wrote:
> Hi all,
> 
> as per grafana [1] the functional job is broken. Looking at logstash [2]
> it started happening consistently since 2017-08-03 16:27. I didn't find
> any particular patch in Neutron that could cause it.
> 
> The culprit is that ovsdb starts misbehaving [3] and then we retry calls
> indefinitely. We still use 2.5.2 openvswitch as we had before. I opened
> a bug [4] and started investigation, I'll update my findings there.
> 
> I think at this point there is no reason to run "recheck" on your patches.
> 
> Thanks,
> Jakub
> 
> [1]
> http://grafana.openstack.org/dashboard/db/neutron-failure-rate?panelId=7&fullscreen
> [2] http://bit.ly/2vdKMwy
> [3]
> http://logs.openstack.org/14/488914/8/check/gate-neutron-dsvm-functional-ubuntu-xenial/75d7482/logs/openvswitch/ovsdb-server.txt.gz
> [4] https://bugs.launchpad.net/neutron/+bug/1709032
> 



