[all][TC] Stats about rechecking patches without reason given

Ghanshyam Mann gmann at ghanshyammann.com
Fri Jul 1 16:23:05 UTC 2022

 ---- On Fri, 01 Jul 2022 01:46:23 -0500  Matthias Runge <mrunge at matthias-runge.de> wrote --- 
 > On 30/06/2022 20:06, Dan Smith wrote:
 > >>> Or vice versa, if there are 20 rechecks for 2 patches, even if neither
 > >>> of them are bare, it's still weird and smth worth reconsidering from
 > >>> project perspective.
 > >>
 > >> I think the idea is to create a culture of debugging and record
 > >> keeping. Yes, I would expect after a few rechecks that maybe the root
 > >> causes would be addressed in this case, but the first step in doing
 > >> that is identifying the problem and making note of it.
 > > 
 > > Right, that is the goal. Asking for a message at least sets the
 > > expectation that people are looking at the reasons for the fails. Just
 > > because they don't doesn't mean they aren't, or don't care, but I think
 > > it helps reinforce the desired behavior. If nothing else, it also helps
 > > observers realize "huh, I've seen a bunch of rechecks about $reason
 > > lately, maybe we should look at that".
 > So, what happens with the script, when you add 2 comments, one: "network 
 > error during package install, let's try again" and the next message 
 > "recheck".

In this case, you can always put the reason in the recheck comment itself, for example
"recheck network error during package install, let's try again". Or, if you have already added a
lengthy text about the failure and then want to recheck, you can add a one-line summary in the
recheck comment.

The overall idea is not to literally count the bare rechecks but to build a habit among us of
looking at the failure before we just recheck.
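
To make the convention above concrete, here is a minimal sketch of how a comment could be
classified. It is not the script used to produce the stats; the pattern, the function name
classify_comment, and the three labels are assumptions for illustration only.

```python
import re

# Hypothetical pattern (an assumption, not the real script's logic):
# a line starting with the word "recheck", optionally followed by a reason
# on the same line.
RECHECK_RE = re.compile(
    r"^\s*recheck\b[ \t]*(?P<reason>.*)$",
    re.IGNORECASE | re.MULTILINE,
)


def classify_comment(comment: str) -> str:
    """Label a single review comment according to the convention above."""
    match = RECHECK_RE.search(comment)
    if match is None:
        return "not a recheck"
    if match.group("reason").strip():
        return "recheck with reason"
    return "bare recheck"


if __name__ == "__main__":
    examples = [
        "recheck",
        "recheck network error during package install, let's try again",
        "network error during package install, let's try again",
    ]
    for text in examples:
        print(f"{classify_comment(text)!r}: {text!r}")
```

Under these assumptions, a reason left in a separate, earlier comment is not seen when the
later "recheck" comment is classified, which is why the reason should go on the recheck
line itself.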

 > In my understanding, that would count as recheck without reason given.
 > (by the script). Maybe it's worth to document how to give a better proof 
 > that someone looked into the logs and tried to get to the root cause of 
 > a previous CI failure?

I think Dan has written a nice document about it, including how to debug the failure:

- https://docs.openstack.org/project-team-guide/testing.html#how-to-handle-test-failures

We welcome everyone to extend it with more detail or examples if there is any project-specific
case that it does not cover.


 > The other issue I see here is that with CI being flaky, chances seem to 
 > get better when doing a recheck.
 > An extreme example: 
 > https://review.opendev.org/c/openstack/tripleo-heat-templates/+/844519
 > required 8 rechecks, no changes in the patch itself, and no 
 > dependencies. The CI failed always in different checks.
 > Matthias
