On Tue, Jul 9, 2024, at 1:37 AM, smooney@redhat.com wrote:
On Tue, 2024-07-09 at 04:38 +0200, Thomas Goirand wrote:
Hi,
Currently, it looks like stestr calculates which thread should run which test at the beginning, by computing some partition, and then launches all tests at once. With a lot of threads, the result is that, at the end, only a few cores are in use, with all the others being idle.
Currently, unless you override it, my understanding is that the distribution of tests is done at the class level in a round-robin manner across the worker threads. As you said, this is done statically before startup, prior to launching the worker threads, by generating a file with the relevant distribution.
The docs [0] say this: "Currently the partitioning algorithm is simple round-robin for tests that stestr has not seen run before, and equal-time buckets for tests that stestr has seen run." This maintains the old testr behavior. Basically, if there is run information in the stestr database, it should use that to bucket tests more evenly. Are you seeing this behavior in a fresh checkout or with existing data in your database? One option for CI would be to record historical runs and preseed the database in fresh checkouts with that information.
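To make the difference between the two strategies quoted from the docs concrete, here is a minimal sketch in Python. It is not stestr's code: the function names, the test names, and the durations are made up for illustration, and the greedy "longest test first into the lightest bucket" rule is only one plausible way to build equal-time buckets, assumed here for the example.

```python
import heapq


def round_robin(tests, workers):
    """Deal tests out in order, one per worker, ignoring how long they take."""
    buckets = [[] for _ in range(workers)]
    for index, test in enumerate(tests):
        buckets[index % workers].append(test)
    return buckets


def equal_time(durations, workers):
    """Greedy sketch: place the longest remaining test in the lightest bucket."""
    buckets = [[] for _ in range(workers)]
    heap = [(0, index) for index in range(workers)]  # (total seconds, bucket index)
    by_duration = sorted(durations.items(), key=lambda item: item[1], reverse=True)
    for test, seconds in by_duration:
        load, index = heapq.heappop(heap)
        buckets[index].append(test)
        heapq.heappush(heap, (load + seconds, index))
    return buckets


def bucket_loads(buckets, durations):
    """Total recorded run time, in seconds, of each bucket."""
    return [sum(durations[test] for test in bucket) for bucket in buckets]


# Hypothetical recorded timings (seconds), listed in discovery order.
durations = {
    "slow_1": 10, "fast_1": 1,
    "slow_2": 10, "fast_2": 1,
    "slow_3": 10, "fast_3": 1,
}

rr_loads = bucket_loads(round_robin(list(durations), 2), durations)
et_loads = bucket_loads(equal_time(durations, 2), durations)
print(rr_loads)  # [30, 3]
print(et_loads)  # [20, 13]
```

With these made-up numbers, round-robin happens to put every slow test on the same worker, while the timing-aware buckets finish in 20 seconds instead of 30. That is the kind of gain a preseeded database could offer in a fresh checkout.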
Would there be a way to have a pool of available threads, and have stestr give threads something to eat when they become available, instead of the current way?
I think, based on how this currently works, that would require us to repeatedly spawn threads and generate new workers after every class: effectively, pregenerate a set of task files at startup and, when one thread completes, grab the next file and launch a new thread.
If you actually wanted to do this with a thread pool and dispatch tests into that, I think it would be a larger rewrite.
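As a rough illustration of why dispatching work to whichever worker is free can help, the sketch below compares the total wall-clock time of an up-front round-robin split with a shared queue that idle workers pull from. It is a pure simulation with made-up durations, not stestr code; the function names are hypothetical, and it ignores per-worker startup cost, which matters for the "spawn a new worker per task file" approach.

```python
import heapq


def static_makespan(durations, workers):
    """Work is split round-robin up front; each worker runs only its own share."""
    loads = [0] * workers
    for index, seconds in enumerate(durations):
        loads[index % workers] += seconds
    return max(loads)


def dynamic_makespan(durations, workers):
    """Each unit goes to whichever worker becomes idle first (shared queue)."""
    free_at = [0] * workers  # min-heap of the times at which workers become idle
    heapq.heapify(free_at)
    for seconds in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + seconds)
    return max(free_at)


# Hypothetical per-class run times in seconds, in queue order.
durations = [10, 1, 10, 1, 10, 1]

print(static_makespan(durations, 2))   # 30
print(dynamic_makespan(durations, 2))  # 21
```

In this toy case the static split leaves one worker idle for most of the run, while the queue keeps both busy until near the end. The units in the queue could be whole classes (closer to the task-file approach) or individual tests (closer to a real thread pool).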
How much work would that be?
I'm not very familiar with the workings of this, although I have had to debug it once or twice a few years ago due to gate issues, but I'm not convinced this would be that easy to do in a more dynamic way. You could likely hack together the approach of generating many test list files and spawning threads up to n with less work than properly using a thread pool and queuing the test units. I suspect both would be more than a couple of hours' work, but I don't know if that's days or weeks. It likely depends on how familiar people are with stestr; unfortunately, there are not many who are.
Cheers,
Thomas Goirand (zigo)
[0] https://stestr.readthedocs.io/en/latest/MANUAL.html#parallel-testing