On 2021-02-07 09:28:35 +0200 (+0200), Dmitriy Rabotyagov wrote:
> Once you said that, I looked through the actual code of the
> prepare-workspace-git role more carefully and you're right - all
> actions are made against already cached repos there. However, since
> it mostly uses commands, it would still be far more efficient to
> create a module replacing all of the command/shell tasks so that
> things run in a multiprocess way. Regarding an example, you can take
> any random task from OSA, i.e. [1] - it takes a bit more than 6
> minutes. When load on providers is high (or their volume backend I/O
> is poor), the time increases [...]
Okay, so that's these tasks:

https://opendev.org/zuul/zuul-jobs/src/commit/8bdb2b538c79dd75bac14180b905a1...
https://opendev.org/zuul/zuul-jobs/src/commit/8bdb2b538c79dd75bac14180b905a1...

It's doing a git clone from the cache on the node into the workspace
(in theory from one path to another within the same filesystem, which
should normally just result in git creating hardlinks to the original
objects/packs), and that took 101 seconds to clone 106 repositories.
After that, 83 seconds were spent fixing up configuration on each of
those clones. The longest step does indeed seem to be the 128 seconds
where it pushed updated refs from the cache on the executor over the
network into the prepared workspace on the remote build node.

I wonder if combining these into a single loop could help reduce the
iteration overhead, or whether processing repositories in parallel
would help (if they're limited by I/O bandwidth then I expect not)?

Regardless, yeah, 5m12s does seem like a good chunk of time. On the
other hand, it's worth keeping in mind that's just shy of 3 seconds
per required-project, so like you say, it's mainly impacting jobs with
a massive number of required-projects. A different approach might be
to revisit the list of required-projects for that job and check
whether they're all actually used.
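To make the parallel idea a bit more concrete, a minimal sketch along
these lines could fan the per-repository clones out across a worker
pool (the cache path, workspace path and worker count here are made-up
assumptions for illustration, not what the role actually does today):

    import concurrent.futures
    import subprocess

    # Hypothetical locations; the real role derives these from its own
    # variables.
    CACHE_ROOT = "/opt/git"            # on-node git cache (assumption)
    WORKSPACE_ROOT = "/home/zuul/src"  # workspace root (assumption)

    def clone_one(project):
        # Cloning between two paths on the same filesystem lets git
        # hardlink the objects/packs instead of copying them, so each
        # call is mostly metadata work rather than bulk I/O.
        src = f"{CACHE_ROOT}/{project}"
        dest = f"{WORKSPACE_ROOT}/{project}"
        subprocess.run(["git", "clone", src, dest],
                       check=True, capture_output=True)
        return project

    def clone_all(projects, workers=8):
        # Each clone is an independent git invocation, so threads are
        # enough; the only real contention is disk I/O.
        with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
            for done in pool.map(clone_one, projects):
                print(f"cloned {done}")

Whether something like that actually helps would depend on whether the
per-repository work is dominated by disk I/O or by per-invocation
overhead; if it's the disk, parallelism probably won't buy much.
-- 
Jeremy Stanley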