Re: [TC][all] Recheck analysis

21 Mar 2024

      ...
i think it would be better to just stop tracking this
and i don't think enforcing this in code is a good thing either.
people that don't care will just work around it so unless we are going
to soft ban an account for a few days or something like that i don't think it will
have much impact.

I think maybe you’re focusing more on the tracking of the reason and less on the overall goal. The reason to actually obsess over why people are rechecking, and also to ask them to provide a reason, is not really so we can correlate reasons to fixes (at least IMHO). As Jeremy has said, that hasn’t borne fruit in the past, despite us very much wishing it would.

To me, the point of doing this exercise is to strongly encourage people to look at the cause of the failure _at_all_. Meaning, open the logs, check the test reports, and at least synthesize some “reason” that indicates that they even looked. That has the benefit of easing people into figuring out how to debug CI issues, and into caring about the high failure rates. Maybe while they’re doing that they’ll see a traceback that looks fishy, or actually find something that is out of step with reality. That doesn’t always happen, of course, but I think we have in the past gotten stuck in situations where people just recheck if they get a -1 from Zuul and assume that it’s not their fault. That collective dis-ownership of the problem is a disease that leads us to a completely non-functional gate. I’ve caught people multiple times rechecking failures that clearly show a test they added, or a test their code changes, failing the same way over and over. Even in less-obvious situations, much can be gained by at least exposing people to the causes instead of just having them assume someone else will fix it.

So, per the above, I wholeheartedly disagree that this is a useless exercise. Since we started doing this, I’ve definitely noticed (admittedly, anecdotally) more collective awareness of the kinds of issues that cause us trouble. I really hope we don’t “stop tracking this” for that reason.

--Dan

Dan Smith