[placement][tripleo][infra] zuul job dependencies for greater good?

Bogdan Dobrelya bdobreli at redhat.com
Thu Feb 28 14:16:48 UTC 2019


> On Mon, 2019-02-25 at 19:42 -0500, Clark Boylan wrote:
>> On Mon, Feb 25, 2019, at 12:51 PM, Ben Nemec wrote:
>>
> 
> snip
> 
>> That said, I wouldn't push too hard in either direction until someone 
>> crunched the numbers and figured out how much time it would have saved 
>> to not run long tests on patch sets with failing unit tests. I feel like 
>> it's probably possible to figure that out, and if so then we should do 
>> it before making any big decisions on this.
> 
> For numbers, the elastic-recheck tool [0] gives us fairly accurate tracking of which issues in the system cause tests to fail. You can use this as a starting point to potentially figure out how expensive indentation errors caught by the pep8 jobs end up being or how often unit tests fail. You probably need to tweak the queries there to get that specific though.
> 
> Periodically I also dump node resource utilization by project, repo, and job [1]. I haven't automated this because Tobiash has written a much better thing that has Zuul inject this into graphite and we should be able to set up a grafana dashboard for that in the future instead.
> 
> These numbers won't tell the whole story, but should paint a fairly accurate high-level picture of the types of things we should look at to be more node efficient and "time in gate" efficient. Looking at these two really quickly myself, it seems that job timeouts are a big cost (is anyone looking into why our jobs time out?).
> 
> [0] http://status.openstack.org/elastic-recheck/index.html
> [1] http://paste.openstack.org/show/746083/
> 
> Hope this helps,
> Clark

Here are some numbers [0] extracted via elastic-recheck console queries. 
They show that about 6% of failures are wasted because of tox issues in 
general, and 3% for tripleo projects in particular.
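
For anyone who wants to redo the arithmetic with their own query results, 
here is a minimal sketch of how such a percentage could be derived. It 
assumes you already have two counts from the elastic-recheck queries: the 
failed builds caused by tox-style jobs and all failed builds in the same 
window. The function name and the sample counts are hypothetical and are 
not taken from the paste.

```python
def wasted_share(tox_failures: int, total_failures: int) -> float:
    """Return the percentage of failed builds attributed to tox-style jobs."""
    if total_failures <= 0:
        raise ValueError("total_failures must be positive")
    if not 0 <= tox_failures <= total_failures:
        raise ValueError("tox_failures must be between 0 and total_failures")
    return 100.0 * tox_failures / total_failures


# Illustrative counts only (assumption), not real CI data.
print(f"{wasted_share(6, 100):.1f}%")
```

The same function can be applied per project to compare, for example, 
the overall share against the share for a single project group.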

My final take is that, given some middle-ground solution like the one I 
illustrated earlier in this sub-thread, it might be worth the effort: 
boosting the total throughput of the OpenStack CI system by 6% is not 
such a bad idea.

[0] http://paste.openstack.org/show/746503/


-- 
Best regards,
Bogdan Dobrelya,
Irc #bogdando


