[openstack-dev] [nova] A modest proposal to reduce reviewer load

Matthew Booth mbooth at redhat.com
Thu Jun 19 08:34:31 UTC 2014

On 19/06/14 08:32, Mark McLoughlin wrote:
> Hi Armando,
> On Tue, 2014-06-17 at 14:51 +0200, Armando M. wrote:
>> I wonder what the turnaround of trivial patches actually is, I bet you
>> it's very very small, and as Daniel said, the human burden is rather
>> minimal (I would be more concerned about slowing them down in the
>> gate, but I digress).
>> I think that introducing a two-tier level for patch approval can only
>> mitigate the problem, but I wonder if we'd need to go a lot further,
>> and rather figure out a way to borrow concepts from queueing theory so
>> that they can be applied in the context of Gerrit. For instance
>> Little's law [1] says:
>> "The long-term average number of customers (in this context reviews)
>> in a stable system L is equal to the long-term average effective
>> arrival rate, λ, multiplied by the average time a customer spends in
>> the system, W; or expressed algebraically: L = λW."
>> L can be used to determine the number of core reviewers that a project
>> will need at any given time, in order to meet a certain arrival rate
>> and average time spent in the queue. If the number of core reviewers
>> is a lot less than L then that core team is understaffed and will need
>> to increase.
>> If we figured out how to model and measure Gerrit as a queuing system,
>> then we could improve its performance a lot more effectively; for
>> instance, this idea of privileging trivial patches over longer patches
>> has roots in a popular scheduling policy [3] for  M/G/1 queues, but
>> that does not really help aging of 'longer service time' patches and
>> does not have a preemption mechanism built-in to avoid starvation. 
>> Just a crazy opinion...
>> Armando
>> [1] - http://en.wikipedia.org/wiki/Little's_law
>> [2] - http://en.wikipedia.org/wiki/Shortest_job_first
>> [3] - http://en.wikipedia.org/wiki/M/G/1_queue
> This isn't crazy at all. We do have a problem that surely could be
> studied and solved/improved by applying queueing theory or lessons from
> fields like lean manufacturing. Right now, we're simply applying our
> intuition and the little I've read about these sorts of problems is that
> your intuition can easily take you down the wrong path.
> There's a bunch of things that occur just glancing through those
> articles:
>   - Do we have an unstable system? Would it be useful to have arrival 
>     and exit rate metrics to help highlight this? Over what time period 
>     would those rates need to be averaged to be useful? Daily, weekly, 
>     monthly, an entire release cycle?
>   - What are we trying to optimize for? The length of time in the 
>     queue? The number of patches waiting in the queue? The response 
>     time to a new patch revision?
>   - We have a single queue, with a bunch of service nodes with a wide 
>     variance between their service rates, very little in the way of
>     scheduling policy, a huge rate of service nodes sending jobs back 
>     for rework, a cost associated with maintaining a job while it sits 
>     in the queue, the tendency for some jobs to disrupt many other jobs 
>     with merge conflicts ... not simple.
>   - Is there any sort of natural limit in our queue size that makes the 
>     system stable - e.g. do people naturally just stop submitting
>     patches at some point?
> My intuition on all of this lately is that we need some way to model and
> experiment with this queue, and I think we could make some interesting
> progress if we could turn it into a queueing network rather than a
> single, extremely complex queue.
> Say we had a front-end for gerrit which tracked which queue a patch is
> in, we could experiment with things like:
>   - a triage queue, with non-cores signed up as triagers looking for 
>     obvious mistakes and choosing the next queue for a patch to enter 
>     into
>   - queues having a small number of cores signed up as owners - e.g. 
>     high priority bugfix, API, scheduler, object conversion, libvirt
>     driver, vmware driver, etc.
>   - we'd allow for a large number of queues so that cores could aim for 
>     an "inbox zero" approach on individual queues, something that would 
>     probably help keep cores motivated.
>   - we could apply different scheduling policies to each of the 
>     different queues - i.e. explicit guidance for cores about which 
>     patches they should pick off the queue next.
>   - we could track metrics on individual queues as well as the whole 
>     network, identifying bottlenecks and properly recognizing which 
>     reviewers are doing a small number of difficult reviews versus 
>     those doing a high number of trivial reviews.
>   - we could require some queues to feed into a final approval queue 
>     where some people are responsible for giving an approved patch a 
>     final sanity check - i.e. there would be a class of reviewer with 
>     good instincts who quickly churn through already-reviewed patches 
>     looking for the kind of mistakes people tend to mistake when 
>     they're down in the weeds.
>   - explicit queues for large, cross-cutting changes like coding style 
>     changes. Perhaps we could stop servicing these queues at certain
>     points in the cycles, or reduce the rate at which they are 
>     serviced.
>   - we could include specs and client patches in the same network so 
>     that they prioritized in the same way.
> Lots of ideas, none of it is trivial ... but perhaps it'll spark
> someone's interest :)

This is all good stuff, but by the sounds of it experimenting in gerrit
isn't likely to be simple.

Remember, though, that the relevant metric is code quality, not review rate.

Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490

More information about the OpenStack-dev mailing list