[openstack-dev] [tripleo][ironic][heat] Adding back the tripleo check job
Steven Hardy
shardy at redhat.com
Wed Dec 2 12:53:53 UTC 2015
On Tue, Dec 01, 2015 at 05:10:57PM -0800, Devananda van der Veen wrote:
> On Tue, Dec 1, 2015 at 3:22 AM, Steven Hardy <shardy at redhat.com> wrote:
>
> On Mon, Nov 30, 2015 at 03:35:13PM -0800, Devananda van der Veen wrote:
> >  On Mon, Nov 30, 2015 at 3:07 PM, Zane Bitter <zbitter at redhat.com> wrote:
> >
> >    On 30/11/15 12:51, Ruby Loo wrote:
> >
> >      On 30 November 2015 at 10:19, Derek Higgins <derekh at redhat.com> wrote:
> >
> >        Hi All,
> >
> >        A few months ago tripleo switched from its devtest based CI to
> >        one that was based on instack. Before doing this we anticipated
> >        disruption in the ci jobs and removed them from non tripleo
> >        projects.
> >
> >        We'd like to investigate adding it back to heat and ironic as
> >        these are the two projects where we find our ci provides the
> >        most value. But we can only do this if the results from the job
> >        are treated as voting.
> >
> >      What does this mean? That the tripleo job could vote and do a -1
> >      and block ironic's gate?
> >
> >        In the past most of the non tripleo projects tended to ignore
> >        the results from the tripleo job as it wasn't unusual for the
> >        job to be broken for days at a time. The thing is, ignoring the
> >        results of the job is the reason (the majority of the time) it
> >        was broken in the first place.
> >
> >        To decrease the number of breakages we are now no longer
> >        running master code for everything (for the non tripleo
> >        projects we bump the versions we use periodically if they are
> >        working). I believe with this model the CI jobs we run have
> >        become a lot more reliable; there are still breakages but far
> >        less frequently.
> >
> >        What I'm proposing is we add at least one of our tripleo jobs
> >        back to both heat and ironic (and other projects associated
> >        with them, e.g. clients, ironic-inspector etc.). tripleo will
> >        switch to running latest master of those repositories and the
> >        cores approving on those projects should wait for a passing CI
> >        job before hitting approve.
> >
> >        So how do people feel about doing this? Can we give it a go? A
> >        couple of people have already expressed an interest in doing
> >        this but I'd like to make sure we're all in agreement before
> >        switching it on.
> >
> >      This seems to indicate that the tripleo jobs are non-voting, or at
> >      least won't block the gate -- so I'm fine with adding tripleo jobs
> >      to ironic. But if you want cores to wait/make sure they pass, then
> >      shouldn't they be voting? (Guess I'm a bit confused.)
> >
> >    +1
> >
> >    I don't think it hurts to turn it on, but tbh I'm uncomfortable with
> >    the mental overhead of a non-voting job that I have to manually treat
> >    as a voting job. If it's stable enough to make it a voting job, I'd
> >    prefer we just make it voting. And if it's not then I'd like to see
> >    it be made stable enough to be a voting job and then make it voting.
> >
> >  This is roughly where I sit as well -- if it's non-voting, experience
> >  tells me that it will largely be ignored, and as such, isn't a good use
> >  of resources.
>
> I'm sure you can appreciate it's something of a chicken/egg problem though
> - if everyone always ignores non-voting jobs, they never become voting.
>
> That effect is magnified with TripleO though, because it consumes so many
> OpenStack projects, any one of which has the capability to break our CI.
> In an ideal world we'd have voting feedback on all-the-things, but that's
> not where we are right now, due in large part to the steady stream of
> regressions (from Heat, Ironic and other projects).
>
> >  I haven't looked at tripleo or tripleoci in a while, so I won't assume
> >  that my recollection of the CI jobs bears any resemblance to what exists
> >  today. Could you explain what areas of ironic (or its subprojects) will
> >  be covered by these tests? If they are already covered by existing
> >  tests, then I don't see the benefit of adding another job; conversely,
> >  if this is testing areas we don't cover today, then there's probably
> >  value in running tripleoci in a voting fashion for now and then moving
> >  that coverage into ironic's project testing.
>
> I like to think of TripleO as a trunk-chasing "power user", and as such
> it gives very valuable "user" feedback, including breaking things in
> exciting ways you hadn't anticipated in your project integration tests.
>
> This has, in the case of Heat at least, made TripleO an extremely
> effective "kitchen sink" stress test, and has uncovered numerous issues we
> failed to find with our internal tests (obviously we do add coverage when
> we find them).
>
> In the case of Ironic, I think the usage is somewhat less demanding, but
> no less "real world" - here's a good example for you:
>
> https://bugs.launchpad.net/ironic/+bug/1507738
>
> In this case, Ironic landed a change to master, which broke all existing
> deployments using CentOS/RHEL derived distributions, so master Ironic has
> been broken for folks using those distros for over 6 weeks.
>
> I know in that case, the problem was a really old iPXE image in the
> distro, and yes there were several possible workarounds, but as a
> developer who cares about users, I personally would rather get gate
> feedback than angry users on IRC/email when I unwittingly break the world
> for them ;)
>
> (note, I'm not assigning any blame above, it's one of *many* examples of
> unexpected breakage due to insufficient gate feedback of real usage
> across many projects).
>
> Great example, Steve, and I agree that more and faster feedback from users
> into patches is a good thing. I'm also sad that it was broken for that
> long and no one raised the issue in our meeting until this week.
>
> This particular bug highlights a gap in Ironic's test coverage which I
> would be delighted if someone wants to close -- that we aren't testing
> support for RH-based distros. Closing that gap doesn't require TripleoCI
> at all; we should simply add a dsvm job for Ironic on Fedora, using a
> Fedora-based ramdisk. That will help prevent similar regressions in the
> future.
>
> Anyway, I have big reservations about putting TripleoCI on a path to ever
> gating Ironic patches. I started to bikeshed on that and then deleted it
> ... tl;dr: I believe it is important for this job to vote in a non-gating
> way. As a reviewer, I'm unlikely to pay attention to it if it doesn't
> vote, and there's a good reason for this:
>
> Non-voting jobs are used for experimentation. A non-voting job is a job
> that we want to vote, but which we don't trust enough yet. It has been
> promoted from the experimental pipeline to the check pipeline so that it
> gets a lot more runs and so that we can stabilize it enough to make it
> voting.

Ah, I think all we have here is a terminology mismatch around "non-voting"
vs "non-gating".

AFAIK what is being proposed is to reinstate the TripleO jobs so they *do*
vote on any change (+1/-1), but they do not block the gate, so we won't get
in the way if occasional outages happen.

> I was going to suggest that tripleoci vote as a third party CI system (I
> know, it's not actually a third-party CI system, but I'd like to vote like
> one). And then I noticed that it used to do just that. [0] If I'm
> interpreting it correctly, the "gate-tripleo-ironic*" jobs voted from a
> separate account, left an informative -1, but did not block the gate.
> That's exactly what I would like in this case.
+1, I think that's what's being proposed, so we're in agreement! :)
Steve