[infra] A change to Zuul's queuing behavior
Hi,

We recently made a change to how Zuul and Nodepool prioritize node requests. Cloud resources are the major constraint in how long it takes Zuul to run test jobs on proposed changes. Because we're using more resources than ever before (but not necessarily because we're doing more work -- Clark has been helping to identify inefficiencies in other mailing list threads), the amount of time it takes to receive results on a change has been increasing.

Since some larger projects consume the bulk of cloud resources in our system, this can be especially frustrating for smaller projects. To be sure, it impacts everyone, but while larger projects receive a continuous stream of results (even if delayed) smaller projects may wait hours before seeing results on a single change.

In order to help all projects maintain a minimal velocity, we've begun dynamically prioritizing node requests based on the number of changes a project has in a given pipeline. This means that the first change for every project in the check pipeline has the same priority. The same is true for the second change of each project in the pipeline. The result is that if a project has 50 changes in check, and another project has a single change in check, the second project won't have to wait for all 50 changes ahead before it gets nodes allocated. As conditions change (requests are fulfilled, changes are added and removed) the priorities of any unfulfilled requests are adjusted accordingly.

In the gate pipeline, the grouping is by shared change queue. But the gate pipeline still has a higher overall precedence than check.

We hope that this will make for a significant improvement in the experience for smaller projects without causing undue hardship for larger ones. We will be closely observing the new behavior and make any necessary tuning adjustments over the next few weeks. Please let us know if you see any adverse impacts, but don't be surprised if you notice node requests being filled "out of order".

-Jim
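For readers who want to see the mechanics concretely, here is a minimal sketch of the relative-priority idea (illustrative only, not the actual Zuul/Nodepool code; the data structures and function below are made up):

```python
# Illustrative sketch only -- not the actual Zuul/Nodepool implementation.
# It shows one way a per-request "relative priority" could be derived: the
# Nth change a project has in a pipeline gets priority N-1, so the first
# change of every project competes on equal footing regardless of how many
# changes the largest projects have queued.
from collections import defaultdict


def assign_relative_priorities(changes):
    """changes: ordered list of (project, change_id) tuples in a pipeline.
    Returns {change_id: relative_priority}, where 0 is the highest priority."""
    seen_per_project = defaultdict(int)
    priorities = {}
    for project, change_id in changes:
        priorities[change_id] = seen_per_project[project]
        seen_per_project[project] += 1
    return priorities


# Example: one project has 3 changes queued, another has a single change.
pipeline = [("big", "C1"), ("big", "C2"), ("big", "C3"), ("small", "C4")]
print(assign_relative_priorities(pipeline))
# {'C1': 0, 'C2': 1, 'C3': 2, 'C4': 0} -- "small"'s only change is served
# alongside "big"'s first change rather than behind all three.
```

Recomputing such a mapping whenever the pipeline contents change is what keeps the priorities of unfulfilled requests up to date.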
On Mon, 2018-12-03 at 13:30 -0800, James E. Blair wrote:
Hi,
We recently made a change to how Zuul and Nodepool prioritize node requests. Cloud resources are the major constraint in how long it takes Zuul to run test jobs on proposed changes. Because we're using more resources than ever before (but not necessarily because we're doing more work -- Clark has been helping to identify inefficiencies in other mailing list threads), the amount of time it takes to receive results on a change has been increasing.
Since some larger projects consume the bulk of cloud resources in our system, this can be especially frustrating for smaller projects. To be sure, it impacts everyone, but while larger projects receive a continuous stream of results (even if delayed) smaller projects may wait hours before seeing results on a single change.
In order to help all projects maintain a minimal velocity, we've begun dynamically prioritizing node requests based on the number of changes a project has in a given pipeline.
This means that the first change for every project in the check pipeline has the same priority. The same is true for the second change of each project in the pipeline. The result is that if a project has 50 changes in check, and another project has a single change in check, the second project won't have to wait for all 50 changes ahead before it gets nodes allocated.

I could be imagining this, but isn't this how Zuul v2 used to work, or rather how the gate was configured a few cycles ago? I remember in the past, when I was working on smaller projects, it was often quicker to submit patches to those instead of nova, for example. In particular, I remember working on both os-vif and nova and finding my os-vif jobs would often get started much quicker than my nova jobs. Anyway, I think this is hopefully a good change for the majority of projects, but it triggered a feeling of deja vu: was this how the gates used to run?
As conditions change (requests are fulfilled, changes are added and removed) the priorities of any unfulfilled requests are adjusted accordingly.
In the gate pipeline, the grouping is by shared change queue. But the gate pipeline still has a higher overall precedence than check.
We hope that this will make for a significant improvement in the experience for smaller projects without causing undue hardship for larger ones. We will be closely observing the new behavior and make any necessary tuning adjustments over the next few weeks. Please let us know if you see any adverse impacts, but don't be surprised if you notice node requests being filled "out of order".
-Jim
On 12/3/2018 3:30 PM, James E. Blair wrote:
Since some larger projects consume the bulk of cloud resources in our system, this can be especially frustrating for smaller projects. To be sure, it impacts everyone, but while larger projects receive a continuous stream of results (even if delayed) smaller projects may wait hours before seeing results on a single change.
In order to help all projects maintain a minimal velocity, we've begun dynamically prioritizing node requests based on the number of changes a project has in a given pipeline.
FWIW, and maybe this is happening across the board right now, but it's taking probably ~16 hours to get results on nova changes right now, which becomes increasingly frustrating when they finally get a node, tests run, and then the job times out or something because the node is slow (or some other known race test failure).

Is there any way to determine or somehow track how long a change has been queued up before and take that into consideration when it's re-enqueued? Like take this change:

https://review.openstack.org/#/c/620154/

That took about 3 days to merge with constant rechecks from the time it was approved. It would be cool if there was a way to say, from within 50 queued nova changes (using the example in the original email), let's say zuul knew that 10 of those 50 have already gone through one or more times and weigh those differently so when they do get queued up, they are higher in the queue than maybe something that is just going through its first time.

-- 
Thanks,
Matt
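A rough sketch of the weighting being suggested here (purely hypothetical; Zuul does not track prior enqueue attempts this way, and the Change fields and helper below are invented for illustration):

```python
# Hypothetical sketch of the suggestion above -- not an existing Zuul feature.
from dataclasses import dataclass


@dataclass
class Change:
    change_id: str
    attempts: int       # invented field: how many times this change was enqueued before
    enqueued_at: float  # invented field: timestamp of the current enqueue


def prioritize(queued: list[Change]) -> list[Change]:
    """Sort so changes with more prior attempts (e.g. rechecks after known race
    failures) come first; ties fall back to how long they have been waiting."""
    return sorted(queued, key=lambda c: (-c.attempts, c.enqueued_at))


queued = [Change("first-timer", 0, 100.0), Change("rechecked-3x", 3, 105.0)]
print([c.change_id for c in prioritize(queued)])  # ['rechecked-3x', 'first-timer']
```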
Matt Riedemann <mriedemos@gmail.com> writes:
On 12/3/2018 3:30 PM, James E. Blair wrote:
Since some larger projects consume the bulk of cloud resources in our system, this can be especially frustrating for smaller projects. To be sure, it impacts everyone, but while larger projects receive a continuous stream of results (even if delayed) smaller projects may wait hours before seeing results on a single change.
In order to help all projects maintain a minimal velocity, we've begun dynamically prioritizing node requests based on the number of changes a project has in a given pipeline.
FWIW, and maybe this is happening across the board right now, but it's taking probably ~16 hours to get results on nova changes right now, which becomes increasingly frustrating when they finally get a node, tests run and then the job times out or something because the node is slow (or some other known race test failure).
Is there any way to determine or somehow track how long a change has been queued up before and take that into consideration when it's re-enqueued? Like take this change:
https://review.openstack.org/#/c/620154/
That took about 3 days to merge with constant rechecks from the time it was approved. It would be cool if there was a way to say, from within 50 queued nova changes (using the example in the original email), let's say zuul knew that 10 of those 50 have already gone through one or more times and weigh those differently so when they do get queued up, they are higher in the queue than maybe something that is just going through its first time.
This suggestion would be difficult to implement, but also, I think it runs counter to some of the ideas that have been put into place in the past. In particular, the idea of clean-check was to make it harder to merge changes with gate failures (under the assumption that they are more likely to introduce racy tests). This might make it easier to recheck-bash bad changes in (along with good).

Anyway, we chatted in IRC a bit and came up with another tweak, which is to group projects together in the check pipeline when setting this priority. We already do this in gate, but currently, every project in the system gets equal footing in check for their first change. The change under discussion would group all tripleo projects together, and all the integrated projects together, so that the first change for a tripleo project had the same priority as the first change for an integrated project, and a puppet project, etc.

The intent is to further reduce the priority "boost" that projects with lots of repos have.

The idea is still to try to find a simple and automated way of more fairly distributing our resources. If this doesn't work, we can always return to the previous strict FIFO method. However, given the extreme delays we're seeing across the board, I'm trying to avoid the necessity of actually allocating quota to projects. If we can't make this work, and we aren't able to reduce utilization by improving the reliability of tests (which, by *far* would be the most effective thing to do -- please work with Clark on that), we may have to start talking about that.

-Jim
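As a concrete illustration of the grouping tweak (again only a sketch; the project-to-group mapping below is invented and is not the real configuration), the per-project count from the earlier example simply becomes a per-group count:

```python
# Illustrative sketch only -- the group mapping and helper are made up, not the
# actual OpenStack/Zuul configuration. The Nth change belonging to a *group* of
# related projects (rather than to a single project) gets relative priority N-1.
from collections import defaultdict

# Hypothetical mapping of projects to priority groups.
PROJECT_GROUPS = {
    "openstack/nova": "integrated",
    "openstack/cinder": "integrated",
    "openstack/tripleo-common": "tripleo",
    "openstack/tripleo-heat-templates": "tripleo",
}


def assign_group_priorities(changes):
    """changes: ordered list of (project, change_id); returns {change_id: priority}."""
    seen_per_group = defaultdict(int)
    priorities = {}
    for project, change_id in changes:
        group = PROJECT_GROUPS.get(project, project)  # ungrouped projects stand alone
        priorities[change_id] = seen_per_group[group]
        seen_per_group[group] += 1
    return priorities


pipeline = [
    ("openstack/nova", "N1"),
    ("openstack/cinder", "C1"),          # second change of the "integrated" group
    ("openstack/tripleo-common", "T1"),  # still the first change of the "tripleo" group
]
print(assign_group_priorities(pipeline))  # {'N1': 0, 'C1': 1, 'T1': 0}
```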
---- On Sat, 08 Dec 2018 07:53:27 +0900 James E. Blair <corvus@inaugust.com> wrote ----
Matt Riedemann <mriedemos@gmail.com> writes:
On 12/3/2018 3:30 PM, James E. Blair wrote:
Since some larger projects consume the bulk of cloud resources in our system, this can be especially frustrating for smaller projects. To be sure, it impacts everyone, but while larger projects receive a continuous stream of results (even if delayed) smaller projects may wait hours before seeing results on a single change.
In order to help all projects maintain a minimal velocity, we've begun dynamically prioritizing node requests based on the number of changes a project has in a given pipeline.
FWIW, and maybe this is happening across the board right now, but it's taking probably ~16 hours to get results on nova changes right now, which becomes increasingly frustrating when they finally get a node, tests run and then the job times out or something because the node is slow (or some other known race test failure).
Is there any way to determine or somehow track how long a change has been queued up before and take that into consideration when it's re-enqueued? Like take this change:
https://review.openstack.org/#/c/620154/
That took about 3 days to merge with constant rechecks from the time it was approved. It would be cool if there was a way to say, from within 50 queued nova changes (using the example in the original email), let's say zuul knew that 10 of those 50 have already gone through one or more times and weigh those differently so when they do get queued up, they are higher in the queue than maybe something that is just going through its first time.
This suggestion would be difficult to implement, but also, I think it runs counter to some of the ideas that have been put into place in the past. In particular, the idea of clean-check was to make it harder to merge changes with gate failures (under the assumption that they are more likely to introduce racy tests). This might make it easier to recheck-bash bad changes in (along with good).
Anyway, we chatted in IRC a bit and came up with another tweak, which is to group projects together in the check pipeline when setting this priority. We already do this in gate, but currently, every project in the system gets equal footing in check for their first change. The change under discussion would group all tripleo projects together, and all the integrated projects together, so that the first change for a tripleo project had the same priority as the first change for an integrated project, and a puppet project, etc.
The intent is to further reduce the priority "boost" that projects with lots of repos have.
The idea is still to try to find a simple and automated way of more fairly distributing our resources. If this doesn't work, we can always return to the previous strict FIFO method. However, given the extreme delays we're seeing across the board, I'm trying to avoid the necessity of actually allocating quota to projects. If we can't make this work, and we aren't able to reduce utilization by improving the reliability of tests (which, by *far* would be the most effective thing to do -- please work with Clark on that), we may have to start talking about that.
-Jim
We could optimize node usage by removing a change's jobs from the running queue on the first failure, instead of letting them run to completion, and then releasing the nodes. This is a trade-off against getting all the failures at once and fixing them together, but I am not sure that is the case all of the time. For example, if a change has a pep8 error, there is no need to run the integration test jobs for it. This could at least save nodes to some extent.

-gmann
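A minimal sketch of that fail-fast idea (illustrative only; this is not how Zuul implements or configures anything, and the job names are just examples):

```python
# Illustrative sketch only -- not Zuul code. It models the suggestion: when the
# first job result for a change comes back failed, cancel the change's
# remaining builds so their nodes can be released early.
from dataclasses import dataclass, field


@dataclass
class Build:
    job_name: str
    state: str = "running"   # running | success | failure | canceled


@dataclass
class BuildSet:
    builds: list[Build] = field(default_factory=list)

    def report_result(self, job_name: str, success: bool) -> None:
        for build in self.builds:
            if build.job_name == job_name:
                build.state = "success" if success else "failure"
        if not success:
            self.cancel_remaining()

    def cancel_remaining(self) -> None:
        """Fail fast: stop still-running builds so their nodes are freed."""
        for build in self.builds:
            if build.state == "running":
                build.state = "canceled"


bs = BuildSet([Build("pep8"), Build("unit-py36"), Build("tempest-full")])
bs.report_result("pep8", success=False)
print([(b.job_name, b.state) for b in bs.builds])
# pep8 failed, so unit-py36 and tempest-full are canceled instead of running to completion.
```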
On 2018-12-09 23:14:37 +0900 (+0900), Ghanshyam Mann wrote: [...]
We could optimize node usage by removing a change's jobs from the running queue on the first failure, instead of letting them run to completion, and then releasing the nodes. This is a trade-off against getting all the failures at once and fixing them together, but I am not sure that is the case all of the time. For example, if a change has a pep8 error, there is no need to run the integration test jobs for it. This could at least save nodes to some extent.
I can recall plenty of times where I've pushed a change which failed pep8 on some non-semantic whitespace complaint and also had unit test or integration test failures. In those cases it's quite obvious that the pep8 failure couldn't have been the reason for the other failed jobs, so seeing them all saved me from wasting time on additional patches and waiting for more rounds of results.

For that matter, a lot of my time as a developer (or even as a reviewer) is saved by seeing which clusters of jobs fail for a given change. For example, if I see all unit test jobs fail but integration test jobs pass, I can quickly infer that there may be issues with a unit test that's being modified and spend less time fumbling around in the dark with various logs.

It's possible we can save some CI resource consumption with such a trade-off, but doing so comes at the expense of developer and reviewer time, so we have to make sure it's worthwhile. There was a point in the past where we did something similar (only run other jobs if a canary linter job passed), and there are good reasons why we didn't continue it.

-- 
Jeremy Stanley
participants (6)
- Chris Friesen
- corvus@inaugust.com
- Ghanshyam Mann
- Jeremy Stanley
- Matt Riedemann
- Sean Mooney