[openstack-dev] [gate] large-ops failure spike
tony at bakeyournoodle.com
Wed Jan 20 19:08:03 UTC 2016
On Wed, Jan 20, 2016 at 07:45:16AM -0500, Sean Dague wrote:
> The large-ops jobs jumped to a 50% fail in check, 25% fail in gate in
> the last 24 hours.
> There isn't an obvious culprit at this point. I spent some time this
> morning digging into it a bit. Possibly each individual instance build
> got slower, possibly some other timeout is getting hit.
> The large-ops jobs were largely maintained by Joe Gordon, who dug into
> them when there were issues. He's not part of the community any more,
> and I don't think there is currently a point person.
> With no current maintainer, I'd suggest we make the jobs non voting -
I think that non-voting makes sense in the short term.
> I also suggest their time has probably come and gone. There is no one
> active on them, and the Rally team is.
> A pre-gating test job is only useful if someone is actively addressing
> systematic fails. This job class no longer has it. We should thus retire it.
If this still adds value (and I think it does), then I think we should try hard
to keep this job (once the gate gets back to normal). 25 hours to gate is nuts.
Yes, I'm volunteering to climb under that bus.