Just wanted to point out that the 'node hours' comparison may not be fair, because what is a typical nova patch or a typical tripleo patch? The number of jobs matched and executed by zuul on a given review will differ from another tripleo patch in the same repo depending on the files touched, the branch, etc., and will vary even more compared to other tripleo repos. I think the same is true for nova or any other project with multiple repos.
It is indeed important to note that some projects may have wildly different numbers depending on what is touched in the patch. Speaking from experience with Nova, Glance, and QA, most job runs are going to be the same for anything that touches code. Nova will run only the unit or functional tests if those are the only files you touched, or only the docs job for a docs-only change, but otherwise we're pretty much running everything all the time, AFAIK. That could be an area for improvement for us, although I think that determining the scope by the files changed is hard for us just because of how intertwined things are, so we probably need to figure out how to target our tests another way. And basically all of Nova is in a single repo.

But yes, totally fair point. I picked a couple of test runs at random to generate these numbers, based on them looking like they were running most/all of what is configured. The first time I did that I picked a stable Neutron patch from before they dropped some testing and got a sky-high number of 54h for a single patch run. So clearly it can vary :)
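For illustration only, here is a rough sketch of how a per-patch "node hours" figure like the ones above could be tallied. It is not the script used to produce these numbers; the job names, durations, and node counts are made up, and it assumes each job holds all of its nodes for its full run time.

```python
from dataclasses import dataclass


@dataclass
class JobRun:
    name: str          # hypothetical job name
    duration_s: float  # wall-clock run time in seconds
    nodes: int = 1     # nodes held for the duration (assumption)


def node_hours(runs):
    """Sum duration * node count over all job runs, in hours."""
    return sum(r.duration_s * r.nodes for r in runs) / 3600.0


# Made-up data for a single patch run.
runs = [
    JobRun("unit", 1800, 1),
    JobRun("functional", 2700, 1),
    JobRun("multinode-integration", 5400, 2),
]

print(f"{node_hours(runs):.2f} node-hours")  # 4.25 node-hours
```

Because the total depends entirely on which jobs a given patch triggers, two patches to the same repo can produce very different figures.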
ACK. We have recently completed some work (as I said, this is an ongoing issue/process for us) at [1][2] to remove some redundant jobs, which should start to help. Mohamed (mnaser o/) has reached out about this and joined our most recent irc meeting [3]. We've already prioritized some more cleanup work for this sprint, including checking file patterns (e.g. started at [4]), tempest tests, and removing many/all of our non-voting jobs as a first pass. Hope that at least starts to address your concern.
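As a simplified sketch of what "checking file patterns" buys: Zuul jobs can carry `files` and `irrelevant-files` regex lists, and a job is skipped when no changed file matches `files`, or when every changed file matches `irrelevant-files`. The function below only approximates that behaviour for illustration; the patterns and paths are hypothetical, and Zuul's real matcher may differ in detail.

```python
import re


def job_matches(changed_files, files=None, irrelevant_files=None):
    """Approximate whether a job would run for a set of changed files."""
    if files:
        pats = [re.compile(p) for p in files]
        # Run only if at least one changed file matches a 'files' pattern.
        if not any(p.search(f) for f in changed_files for p in pats):
            return False
    if irrelevant_files:
        pats = [re.compile(p) for p in irrelevant_files]
        # Skip if every changed file matches an 'irrelevant-files' pattern.
        if all(any(p.search(f) for p in pats) for f in changed_files):
            return False
    return True


# Hypothetical patterns: skip the job for docs-only changes.
irrelevant = [r"^doc/.*$", r"^.*\.rst$"]

docs_only = job_matches(["doc/source/index.rst", "README.rst"],
                        irrelevant_files=irrelevant)
code_change = job_matches(["doc/source/index.rst", "nova/compute/manager.py"],
                          irrelevant_files=irrelevant)
print(docs_only, code_change)
```

With patterns like these, a docs-only patch skips the job, while any patch that also touches code still runs it.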
Yep, and thanks a lot for what you've done and continue to do. Obviously looking at the "tripleo is ~40%" report, I expected my script to show tripleo as having some insanely high test load. Looking at the actual numbers, it's clear that you're not only not the heaviest, but given what we know to be a super heavy process of deploying nodes like you do, seemingly relatively efficient. I'm sure there's still improvement that could be made on top of your current list, but I think the lesson in these numbers is that we definitely need to look elsewhere than the traditional openstack pastime of blaming tripleo ;)

For my part so far, I've got a stack of patches proposed to make devstack run quite a bit faster for jobs that use it:

https://review.opendev.org/q/topic:%2522async%2522+status:open+project:opens...

and I've also proposed that nova stop running two grenades which almost 100% overlap (which strangely has to be a change in the tempest repo):

https://review.opendev.org/c/openstack/tempest/+/771499

Both of these have barriers to approval at the moment, but both have big multipliers capable of making a difference.

--Dan