<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Dec 1, 2015 at 3:22 AM, Steven Hardy <span dir="ltr"><<a href="mailto:shardy@redhat.com" target="_blank">shardy@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span class="">On Mon, Nov 30, 2015 at 03:35:13PM -0800, Devananda van der Veen wrote:<br>
> On Mon, Nov 30, 2015 at 3:07 PM, Zane Bitter <<a href="mailto:zbitter@redhat.com">zbitter@redhat.com</a>> wrote:<br>
><br>
> On 30/11/15 12:51, Ruby Loo wrote:<br>
><br>
> On 30 November 2015 at 10:19, Derek Higgins <<a href="mailto:derekh@redhat.com">derekh@redhat.com</a><br>
> <mailto:<a href="mailto:derekh@redhat.com">derekh@redhat.com</a>>> wrote:<br>
><br>
</span>>   Hi All,<br>
><br>
>   A few months ago tripleo switched from its devtest based CI to<br>
> one<br>
>   that was based on instack. Before doing this we anticipated<br>
>   disruption in the ci jobs and removed them from non tripleo<br>
> projects.<br>
><br>
>   We'd like to investigate adding it back to heat and<br>
> ironic as<br>
>   these are the two projects where we find our ci provides the<br>
> most<br>
>   value. But we can only do this if the results from the job are<br>
>   treated as voting.<br>
<span class="">><br>
> What does this mean? That the tripleo job could vote and do a -1 and<br>
> block ironic's gate?<br>
><br>
</span>>   In the past most of the non tripleo projects tended to<br>
> ignore<br>
>   the results from the tripleo job as it wasn't unusual for the<br>
> job to<br>
>   be broken for days at a time. The thing is, ignoring the results of<br>
> the<br>
>   job is the reason (the majority of the time) it was broken in<br>
> the<br>
>   first place.<br>
>   To decrease the number of breakages we are now no longer<br>
>   running master code for everything (for the non tripleo projects<br>
> we<br>
>   bump the versions we use periodically if they are working). I<br>
>   believe with this model the CI jobs we run have become a lot<br>
> more<br>
>   reliable; there are still breakages but far less frequently.<br>
><br>
>   What I'm proposing is we add at least one of our tripleo jobs back<br>
> to<br>
>   both heat and ironic (and other projects associated with them<br>
> e.g.<br>
>   clients, ironic-inspector etc.), tripleo will switch to running<br>
>   latest master of those repositories and the cores approving on<br>
> those<br>
>   projects should wait for a passing CI job before hitting<br>
> approve.<br>
>   So how do people feel about doing this? Can we give it a go? A<br>
>   couple of people have already expressed an interest in doing<br>
> this<br>
>   but I'd like to make sure we're all in agreement before switching<br>
<span class="">> it on.<br>
><br>
> This seems to indicate that the tripleo jobs are non-voting, or at<br>
> least<br>
> won't block the gate -- so I'm fine with adding tripleo jobs to<br>
> ironic.<br>
> But if you want cores to wait/make sure they pass, then shouldn't they<br>
> be voting? (Guess I'm a bit confused.)<br>
><br>
> +1<br>
><br>
> I don't think it hurts to turn it on, but tbh I'm uncomfortable with the<br>
> mental overhead of a non-voting job that I have to manually treat as a<br>
> voting job. If it's stable enough to make it a voting job, I'd prefer we<br>
> just make it voting. And if it's not then I'd like to see it be made<br>
> stable enough to be a voting job and then make it voting.<br>
><br>
> This is roughly where I sit as well -- if it's non-voting, experience<br>
> tells me that it will largely be ignored, and as such, isn't a good use of<br>
> resources.<br>
<br>
</span>I'm sure you can appreciate it's something of a chicken/egg problem though<br>
- if everyone always ignores non-voting jobs, they never become voting.<br>
<br>
That effect is magnified with TripleO though, because it consumes so many<br>
OpenStack projects, any one of which has the capability to break our CI, so<br>
in an ideal world we'd have voting feedback on all-the-things, but that's<br>
not where we are right now, due in large part to the steady stream of<br>
regressions (from Heat, Ironic and other projects).<br>
<span class=""><br>
> I haven't looked at tripleo or tripleoci in a while, so I won't assume that<br>
> my recollection of the CI jobs bears any resemblance to what exists today.<br>
> Could you explain what areas of ironic (or its subprojects) will be<br>
</span>> covered by these tests? If they are already covered by existing tests,<br>
<span class="">> then I don't see the benefit of adding another job; conversely, if this is<br>
> testing areas we don't cover today, then there's probably value in running<br>
> tripleoci in a voting fashion for now and then moving that coverage into<br>
> ironic's project testing.<br>
<br>
</span>I like to think of TripleO as a trunk-chasing "power user", and as such<br>
gives very valuable "user" feedback, including breaking things in exciting<br>
ways you hadn't anticipated in your project integration tests.<br>
<br>
This has, in the case of Heat at least, made TripleO an extremely effective<br>
"kitchen sink" stress test, and has uncovered numerous issues we failed to<br>
find with our internal tests (obviously we do add coverage when we find<br>
them).<br>
<br>
In the case of Ironic, I think the usage is somewhat less demanding, but no<br>
less "real world" - here's a good example for you:<br>
<br>
<a href="https://bugs.launchpad.net/ironic/+bug/1507738" rel="noreferrer" target="_blank">https://bugs.launchpad.net/ironic/+bug/1507738</a><br>
<br>
In this case, Ironic landed a change to master, which broke all existing<br>
deployments using CentOS/RHEL-derived distributions, so master Ironic has<br>
been broken for folks using those distros for over 6 weeks.<br>
<br>
I know in that case, the problem was a really old iPXE image in the distro,<br>
and yes there were several possible workarounds, but as a developer who<br>
cares about users, I personally would rather get gate feedback than angry<br>
users on IRC/email when I unwittingly break the world for them ;)<br>
<br>
(note, I'm not assigning any blame above, it's one of *many* examples of<br>
unexpected breakage due to insufficient gate feedback of real usage across<br>
many projects).<br></blockquote><div><br></div><div>Great example, Steve, and I agree that more and faster feedback from users into patches is a good thing. I'm also sad that it was broken for that long and no one raised the issue in our meeting until this week.</div><div><br></div><div>This particular bug highlights a gap in Ironic's test coverage that I would be delighted for someone to close: we aren't testing support for RH-based distros. Closing that gap doesn't require TripleoCI at all; we should simply add a dsvm job for Ironic on Fedora, using a Fedora-based ramdisk. That will help prevent similar regressions in the future.</div><div><br></div><div>Anyway, I have big reservations about putting TripleoCI on a path to ever gating Ironic patches. I started to bikeshed on that and then deleted it ... tl;dr: I believe it is important for this job to vote in a non-gating way. As a reviewer, I'm unlikely to pay attention to it if it doesn't vote, and there's a good reason for this:</div><div><br></div><div>Non-voting jobs are used for experimentation. A non-voting job is a job that we want to vote, but which we don't trust enough yet. It has been promoted from the experimental pipeline to the check pipeline so that it gets a lot more runs and so that we can stabilize it enough to make it voting.</div><div><br></div><div>I was going to suggest that tripleoci vote as a third-party CI system (I know, it's not actually a third-party CI system, but I'd like it to vote like one). And then I noticed that it used to do just that. [0] If I'm interpreting it correctly, the "gate-tripleo-ironic*" jobs voted from a separate account, left an informative -1, but did not block the gate. That's exactly what I would like in this case.</div><div><br></div><div><br></div><div>Cheers,</div><div>-Devananda</div><div><br></div><div>[0] <a href="https://review.openstack.org/#/c/184402/">https://review.openstack.org/#/c/184402/</a></div></div></div></div>