[openstack-dev] Introducing the NNFI scheduler for Zuul

Jay Pipes jaypipes at gmail.com
Thu Sep 26 17:38:03 UTC 2013


On 09/26/2013 01:10 PM, James E. Blair wrote:
> We recently made a change to Zuul's scheduling algorithm (how it
> determines which changes to combine together and test).  Now when a
> change fails tests (or has a merge conflict), Zuul will move it out of
> the series of changes that it is stacking together to be tested, but it
> will still keep that change's position in the queue.  Jobs for changes
> behind it will be restarted without the failed change in their proposed
> repo states.  And if something later fails ahead of it, Zuul will once
> again put it back into the stream of changes it's testing and give it
> another chance.
>
> To visualize this, we've updated the status screen to include a tree
> view:
>
>    http://status.openstack.org/zuul/
>
> (If you already have that loaded, be sure to hit reload.)
>
> In Zuul, this is called the Nearest Non-Failing Item (NNFI) algorithm
> because in short, each item in a queue is at all times being tested
> based on the nearest non-failing item ahead of it in the queue.
>
> On the infrastructure side, this is going to drive our use of cloud
> resources even more, as Zuul will now try to run as many jobs as it can,
> continuously.  Every time a change fails, all of the jobs for changes
> behind it will be aborted and restarted with a new proposed future
> state.
>
> For developers, this means that changes should land faster and overall
> throughput should increase, as Zuul won't be waiting as long to re-test
> changes after a job has failed.  And that's what this is ultimately about --
> virtual machines are cheap compared to developer time, so the more
> velocity our automated tests can sustain, the more velocity our project
> can achieve.
>
> -Jim

Just wanted to say great work on this to all involved, and thank you!

-jay
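
For anyone reading this thread later, the NNFI rule Jim describes above can be sketched in a few lines of Python. This is only an illustrative model, not Zuul's actual code: the names Item, proposed_states and mark_failed are hypothetical, and it assumes that an item's proposed repo state is simply every non-failing item ahead of it plus the item itself, and that items behind a new failure get a fresh chance.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Item:
    """One change in the queue (hypothetical model, not a Zuul class)."""
    name: str
    failing: bool = False


def proposed_states(queue: List[Item]) -> Dict[str, List[str]]:
    """Map each item to the stack of changes it would be tested with.

    Each item is stacked on the non-failing items ahead of it, so its
    immediate base is the nearest non-failing item ahead in the queue.
    """
    states: Dict[str, List[str]] = {}
    good_ahead: List[str] = []
    for item in queue:
        states[item.name] = good_ahead + [item.name]
        if not item.failing:
            good_ahead = good_ahead + [item.name]
    return states


def mark_failed(queue: List[Item], name: str) -> List[str]:
    """Record a failure and return the items whose jobs must restart.

    The failed item keeps its queue position. Items behind it are
    assumed to be given another chance, so their failing flag is reset.
    """
    before = proposed_states(queue)
    index = next(i for i, item in enumerate(queue) if item.name == name)
    queue[index].failing = True
    for item in queue[index + 1:]:
        item.failing = False
    after = proposed_states(queue)
    # Restart every item whose proposed repo state changed.
    return [i.name for i in queue if before[i.name] != after[i.name]]


queue = [Item(n) for n in "ABCD"]
print(mark_failed(queue, "B"))  # ['C', 'D']
print(mark_failed(queue, "A"))  # ['B', 'C', 'D']
```

In the first call B fails, so C and D are restarted without B in their proposed state while B keeps its slot. In the second call A fails ahead of B, so B is put back into the stream and tested again, this time without A.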



