[openstack-dev] [all] periodic jobs for master

David Kranz dkranz at redhat.com
Wed Oct 22 15:22:02 UTC 2014


On 10/22/2014 06:07 AM, Thierry Carrez wrote:
> Ihar Hrachyshka wrote:
>> [...]
>> For stable branches, we have so called periodic jobs that are
>> triggered once in a while against the current code in a stable branch,
>> and report to openstack-stable-maint@ mailing list. An example of
>> failing periodic job report can be found at [2]. I envision that
>> similar approach can be applied to test auxiliary features in gate. So
>> once something is broken in master, the interested parties behind the
>> auxiliary feature will be informed in due time.
>> [...]
> The main issue with periodic jobs is that since they are non-blocking,
> they can get ignored really easily. It takes a bit of organization and
> process to get those failures addressed.
>
> It's only recently (and a lot thanks to you) that failures in the
> periodic jobs for stable branches are being taken into account quickly
> and seriously. For years the failures just lingered until they blocked
> someone's work enough for that person to go and fix them.
>
> So while I think periodic jobs are a good way to increase corner case
> testing coverage, I am skeptical of our collective ability to have the
> discipline necessary for them not to become a pain. We'll need a strict
> process around them: identified groups of people signed up to act on
> failure, and failure stats so that we can remove jobs that don't get
> enough attention.
>
While I share some of your skepticism, we have to find a way to make
this work. Saying we are doing our best to ensure the quality of
upstream OpenStack based on a single tier of testing (the gate) that is
limited to 40-minute runs is not plausible. Of course a lot more testing
happens downstream, but we can do better as a community. I think we
should rephrase this subject as "non-gating" jobs. We could have various
kinds of stress and longevity jobs running to good effect if we can
solve this process problem.

Following up on your process suggestion, in practice the most likely way
this could actually work is to have a rotation of "build guardians" who
agree to keep an eye on jobs for a short period of time. There would
need to be a separate rotation list for each project that has
non-gating, project-specific jobs. This will likely happen as we move
towards deeper functional testing in projects. The QA team would be the
logical pool for a rotation of more global jobs of the kind I think Ihar
was referring to.

As for failure stats, each of these non-gating jobs would have its own
name, so logstash could be used to debug failures. Do we already have
anything that tracks failure rates of jobs?
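
To make the question concrete, here is a rough sketch of the kind of
per-job failure-rate tracking I have in mind. It assumes the job results
are already available as (job name, status) pairs; the job names, the
status strings, and the threshold are all made up for illustration, and
nothing here talks to logstash or any real service.

```python
from collections import Counter


def failure_rates(results):
    """Return {job_name: fraction of runs that did not succeed}.

    `results` is an iterable of (job_name, status) pairs. Any status
    other than "SUCCESS" is counted as a failure (an assumption).
    """
    runs = Counter()
    failures = Counter()
    for job, status in results:
        runs[job] += 1
        if status != "SUCCESS":
            failures[job] += 1
    return {job: failures[job] / runs[job] for job in runs}


def jobs_needing_attention(rates, threshold=0.5):
    """List jobs whose failure rate is at or above `threshold`."""
    return sorted(job for job, rate in rates.items() if rate >= threshold)


# Hypothetical sample data; these are not real job names.
results = [
    ("periodic-stress-example", "SUCCESS"),
    ("periodic-stress-example", "FAILURE"),
    ("periodic-longevity-example", "FAILURE"),
    ("periodic-longevity-example", "FAILURE"),
]

rates = failure_rates(results)
flagged = jobs_needing_attention(rates, threshold=0.75)
```

A list like `flagged` is what a build guardian rotation could review, and
a job that stays on it for a long time would be a candidate for removal.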

  -David





