[openstack-dev] Current biggest OpenStack gate fail culprit - neutron bug #1194026

Sean Dague sean at dague.net
Thu Jul 11 15:54:20 UTC 2013


On 07/11/2013 11:33 AM, Matthew Treinish wrote:
> On Thu, Jul 11, 2013 at 08:16:26AM -0700, Dan Smith wrote:
>>> In the corner to my left, our current largest gate reset culprit
>>> appears to be neutron bug #1194026 - weighing in with 62 rechecks
>>> since June 24th (http://status.openstack.org/rechecks/)
>>
>> So, with some of the highest rates of patch traffic we've seen over the
>> last couple of weeks before the H2 deadline, I think this is really
>> becoming a problem. I think merge times are through the roof as a
>> result.
>>
>> Since the neutron gate is not a full tempest run, I think we should
>> consider making a temporary change. I know that turning it into a
>> non-voting job is not a popular solution, and I hate to even suggest
>> it. However, it's just a subset of the tests anyway and I think the
>> impact is currently overshadowing the potential for regression
>> detection, given the relatively small amount of coverage. Is this
>> something people would consider?
>
> I don't think this is the way to go. Even though it's limited coverage,
> without it Neutron would have no gating integrated testing run on it at all.
> In my experience this will just cause more difficulty down the road when
> we decide to switch it back to voting. Things tend to bit-rot fairly quickly.
>
>>
>> Of course, the other option is to try to skip the offending test if
>> we're running with neutron support, which may help. Since we don't know
>> what the problem is and it *seems* to be an issue with resources not
>> becoming available before a timeout (AIUI), I worry that this will just
>> move the problem elsewhere.
>
> So if it is a single test (or set of tests) failing, then this is doable. We
> can do this in the short term, but if it just moves the problem elsewhere then
> we're just in the same situation, right? So what's the harm in trying this?

Let's start with the test skip.

I am, however, pretty frustrated that we're really not getting anyone from
neutron looking at this. We're at 121 rechecks (plus I'm sure there were
plenty of "no bug" rechecks; I've seen a couple), so call it 150+ gate
resets because of this bug. At roughly an hour per reset, that is 150
hours' worth of delay put into the gate.

	-Sean

-- 
Sean Dague
http://dague.net


