[openstack-dev] [tripleo][ironic][heat] Adding back the tripleo check job

Steven Hardy shardy at redhat.com
Tue Dec 1 11:22:22 UTC 2015


On Mon, Nov 30, 2015 at 03:35:13PM -0800, Devananda van der Veen wrote:
>    On Mon, Nov 30, 2015 at 3:07 PM, Zane Bitter <zbitter at redhat.com> wrote:
> 
>      On 30/11/15 12:51, Ruby Loo wrote:
> 
>        On 30 November 2015 at 10:19, Derek Higgins <derekh at redhat.com> wrote:
> 
>            Hi All,
> 
>                 A few months ago tripleo switched from its devtest based CI
>            to one that was based on instack. Before doing this we
>            anticipated disruption in the ci jobs and removed them from non
>            tripleo projects.
> 
>                 We'd like to investigate adding it back to heat and ironic
>            as these are the two projects where we find our ci provides the
>            most value. But we can only do this if the results from the job
>            are treated as voting.
> 
>        What does this mean? That the tripleo job could vote and do a -1 and
>        block ironic's gate?
> 
>                 In the past most of the non tripleo projects tended to
>            ignore the results from the tripleo job as it wasn't unusual for
>            the job to be broken for days at a time. The thing is, ignoring
>            the results of the job is the reason (the majority of the time)
>            it was broken in the first place.
>                 To decrease the number of breakages we are now no longer
>            running master code for everything (for the non tripleo projects
>            we bump the versions we use periodically if they are working). I
>            believe with this model the CI jobs we run have become a lot
>            more reliable; there are still breakages but far less frequently.
> 
>            What I'm proposing is we add at least one of our tripleo jobs
>            back to both heat and ironic (and other projects associated with
>            them e.g. clients, ironicinspector etc.), tripleo will switch to
>            running latest master of those repositories and the cores
>            approving on those projects should wait for a passing CI job
>            before hitting approve. So how do people feel about doing this?
>            Can we give it a go? A couple of people have already expressed
>            an interest in doing this but I'd like to make sure we're all in
>            agreement before switching it on.
> 
>        This seems to indicate that the tripleo jobs are non-voting, or at
>        least won't block the gate -- so I'm fine with adding tripleo jobs
>        to ironic. But if you want cores to wait/make sure they pass, then
>        shouldn't they be voting? (Guess I'm a bit confused.)
> 
>      +1
> 
>      I don't think it hurts to turn it on, but tbh I'm uncomfortable with the
>      mental overhead of a non-voting job that I have to manually treat as a
>      voting job. If it's stable enough to make it a voting job, I'd prefer we
>      just make it voting. And if it's not then I'd like to see it be made
>      stable enough to be a voting job and then make it voting.
> 
>    This is roughly where I sit as well -- if it's non-voting, experience
>    tells me that it will largely be ignored, and as such, isn't a good use of
>    resources.

I'm sure you can appreciate it's something of a chicken/egg problem though
- if everyone always ignores non-voting jobs, they never become voting.

That effect is magnified with TripleO though, because it consumes so many
OpenStack projects, any one of which has the capability to break our CI. In
an ideal world we'd have voting feedback on all-the-things, but that's
not where we are right now, due in large part to the steady stream of
regressions (from Heat, Ironic and other projects).

>    I haven't looked at tripleo or tripleoci in a while, so I won't assume that
>    my recollection of the CI jobs bears any resemblance to what exists today.
>    Could you explain what areas of ironic (or its subprojects) will be
>    covered by these tests?  If they are already covered by existing tests,
>    then I don't see the benefit of adding another job; conversely, if this is
>    testing areas we don't cover today, then there's probably value in running
>    tripleoci in a voting fashion for now and then moving that coverage into
>    ironic's project testing.

I like to think of TripleO as a trunk-chasing "power user", and as such
it gives very valuable "user" feedback, including breaking things in exciting
ways you hadn't anticipated in your project integration tests.

This has, in the case of Heat at least, made TripleO an extremely effective
"kitchen sink" stress test, and has uncovered numerous issues we failed to
find with our internal tests (obviously we do add coverage when we find
them).

In the case of Ironic, I think the usage is somewhat less demanding, but no
less "real world" - here's a good example for you:

https://bugs.launchpad.net/ironic/+bug/1507738

In this case, Ironic landed a change to master, which broke all existing
deployments using CentOS/RHEL-derived distributions, so master Ironic has
been broken for folks using those distros for over 6 weeks.

I know in that case, the problem was really an old iPXE image in the distro,
and yes there were several possible workarounds, but as a developer who
cares about users, I personally would rather get gate feedback than angry
users on IRC/email when I unwittingly break the world for them ;)

(note, I'm not assigning any blame above, it's one of *many* examples of
unexpected breakage due to insufficient gate feedback of real usage across
many projects).

Cheers,

Steve


