[openstack-dev] [nova] A modest proposal to reduce reviewer load

Mark McLoughlin markmc at redhat.com
Thu Jun 19 12:22:53 UTC 2014


On Thu, 2014-06-19 at 09:34 +0100, Matthew Booth wrote:
> On 19/06/14 08:32, Mark McLoughlin wrote:
> > Hi Armando,
> > 
> > On Tue, 2014-06-17 at 14:51 +0200, Armando M. wrote:
> >> I wonder what the turnaround on trivial patches actually is; I bet
> >> it's very small, and as Daniel said, the human burden is rather
> >> minimal (I would be more concerned about slowing them down in the
> >> gate, but I digress).
> >>
> >> I think that introducing a two-tier system for patch approval can
> >> only mitigate the problem, but I wonder if we'd need to go a lot
> >> further and figure out a way to borrow concepts from queueing
> >> theory, so that they can be applied in the context of Gerrit. For
> >> instance, Little's law [1] says:
> >>
> >> "The long-term average number of customers (in this context reviews)
> >> in a stable system L is equal to the long-term average effective
> >> arrival rate, λ, multiplied by the average time a customer spends in
> >> the system, W; or expressed algebraically: L = λW."
> >>
> >> L can be used to determine the number of core reviewers that a project
> >> will need at any given time, in order to meet a certain arrival rate
> >> and average time spent in the queue. If the number of core reviewers
> >> is a lot less than L then that core team is understaffed and will need
> >> to increase.
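
To make that concrete, a back-of-the-envelope sketch - every number
in it is invented for illustration, not measured from our Gerrit:

    # Little's law: L = lambda * W. All figures below are made up.

    arrival_rate = 30.0       # lambda: patch revisions arriving per day
    avg_time_in_system = 5.0  # W: average days a revision spends in review

    # Average number of revisions in flight at any moment.
    L = arrival_rate * avg_time_in_system  # 150

    # If a core can service, say, 8 reviews a day, keeping the queue
    # stable takes at least lambda / 8 active cores.
    reviews_per_core_per_day = 8.0
    min_cores = arrival_rate / reviews_per_core_per_day  # 3.75

    print("L = %.0f revisions in flight" % L)
    print("need >= %.1f active cores to keep up" % min_cores)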
> >>
> >> If we figured out how to model and measure Gerrit as a queueing
> >> system, then we could improve its performance a lot more
> >> effectively; for instance, this idea of privileging trivial patches
> >> over longer patches has roots in a popular scheduling policy [2]
> >> for M/G/1 queues [3], but that does not really help the aging of
> >> 'longer service time' patches and has no built-in preemption
> >> mechanism to avoid starvation.
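
On the starvation point - a minimal sketch of shortest-job-first with
an aging term bolted on (the cost function and the weight are pure
assumptions, not a tested policy):

    import time

    # Lower score == review sooner. Small patches start near the front,
    # but each day spent waiting effectively shrinks a patch by
    # AGE_WEIGHT lines, so big patches eventually bubble up too.
    AGE_WEIGHT = 50.0  # lines of "credit" per day spent in the queue

    def review_priority(lines_changed, submitted_at, now=None):
        now = now if now is not None else time.time()
        age_days = (now - submitted_at) / 86400.0
        return lines_changed - AGE_WEIGHT * age_days

    def next_patch(patches):
        # patches: iterable of dicts with "lines" and "submitted" keys
        return min(patches, key=lambda p: review_priority(p["lines"],
                                                          p["submitted"]))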
> >>
> >> Just a crazy opinion...
> >> Armando
> >>
> >> [1] - http://en.wikipedia.org/wiki/Little's_law
> >> [2] - http://en.wikipedia.org/wiki/Shortest_job_first
> >> [3] - http://en.wikipedia.org/wiki/M/G/1_queue
> > 
> > This isn't crazy at all. We do have a problem that could surely be
> > studied and improved by applying queueing theory, or lessons from
> > fields like lean manufacturing. Right now we're simply applying our
> > intuition, and the little I've read about these sorts of problems
> > suggests that intuition can easily lead you down the wrong path.
> > 
> > A bunch of things occur to me just glancing through those
> > articles:
> > 
> >   - Do we have an unstable system? Would it be useful to have arrival 
> >     and exit rate metrics to help highlight this? Over what time period 
> >     would those rates need to be averaged to be useful? Daily, weekly, 
> >     monthly, an entire release cycle?
> > 
> >   - What are we trying to optimize for? The length of time in the 
> >     queue? The number of patches waiting in the queue? The response 
> >     time to a new patch revision?
> > 
> >   - We have a single queue, with a bunch of service nodes with a wide 
> >     variance between their service rates, very little in the way of
> >     scheduling policy, a huge rate of service nodes sending jobs back 
> >     for rework, a cost associated with maintaining a job while it sits 
> >     in the queue, the tendency for some jobs to disrupt many other jobs 
> >     with merge conflicts ... not simple.
> > 
> >   - Is there any sort of natural limit in our queue size that makes the 
> >     system stable - e.g. do people naturally just stop submitting
> >     patches at some point?
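
On the first bullet above - pulling arrival and exit rates out seems
straightforward once we've exported (timestamp, kind) events from
Gerrit (the export itself is hand-waved here):

    from collections import Counter
    from datetime import datetime, timedelta

    def rates(events, window_days=7):
        # events: iterable of (datetime, kind) pairs, where kind is
        # "opened" or "closed"; returns average arrivals and exits
        # per day over the trailing window.
        cutoff = datetime.utcnow() - timedelta(days=window_days)
        counts = Counter(kind for ts, kind in events if ts >= cutoff)
        return (counts["opened"] / float(window_days),
                counts["closed"] / float(window_days))

Comparing the two rates over daily, weekly and per-cycle windows would
at least show whether any instability is a blip or a trend.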
> > 
> > My intuition on all of this lately is that we need some way to model and
> > experiment with this queue, and I think we could make some interesting
> > progress if we could turn it into a queueing network rather than a
> > single, extremely complex queue.
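
Even before touching gerrit, a toy simulation would let us compare
staffing and policy choices offline - something as dumb as this (all
rates invented) already shows whether a given capacity keeps the
backlog bounded:

    import random

    def simulate(days=200, cores=12, reviews_per_core_per_day=3):
        backlog = 0
        history = []
        for _ in range(days):
            arrivals = random.randint(20, 40)  # patches/day, made up
            capacity = cores * reviews_per_core_per_day
            backlog = max(0, backlog + arrivals - capacity)
            history.append(backlog)
        return history

    history = simulate()
    print("final backlog: %d, peak: %d" % (history[-1], max(history)))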
> > 
> > Say we had a front-end for gerrit which tracked which queue a patch is
> > in, we could experiment with things like:
> > 
> >   - a triage queue, with non-cores signed up as triagers looking for 
> >     obvious mistakes and choosing the next queue for a patch to enter
> > 
> >   - queues having a small number of cores signed up as owners - e.g. 
> >     high priority bugfix, API, scheduler, object conversion, libvirt
> >     driver, vmware driver, etc.
> > 
> >   - we'd allow for a large number of queues so that cores could aim for 
> >     an "inbox zero" approach on individual queues, something that would 
> >     probably help keep cores motivated.
> > 
> >   - we could apply different scheduling policies to each of the 
> >     different queues - i.e. explicit guidance for cores about which 
> >     patches they should pick off the queue next.
> > 
> >   - we could track metrics on individual queues as well as the whole 
> >     network, identifying bottlenecks and properly recognizing which 
> >     reviewers are doing a small number of difficult reviews versus 
> >     those doing a high number of trivial reviews.
> > 
> >   - we could require some queues to feed into a final approval queue 
> >     where some people are responsible for giving an approved patch a 
> >     final sanity check - i.e. there would be a class of reviewer with 
> >     good instincts who quickly churn through already-reviewed patches 
> >     looking for the kind of mistakes people tend to make when 
> >     they're down in the weeds.
> > 
> >   - explicit queues for large, cross-cutting changes like coding style 
> >     changes. Perhaps we could stop servicing these queues at certain
> >     points in the cycles, or reduce the rate at which they are 
> >     serviced.
> > 
> >   - we could include specs and client patches in the same network so 
> >     that they are prioritized in the same way.
> > 
> > Lots of ideas, none of it is trivial ... but perhaps it'll spark
> > someone's interest :)
> 
> This is all good stuff, but by the sounds of it experimenting in gerrit
> isn't likely to be simple.

Right, which is why I said "a front-end for gerrit" ... this would sit
in front of gerrit, at least to begin with.
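
To be a little more concrete about the shape I have in mind - all the
names below are invented placeholders, it's just a sketch of the
routing idea:

    # A patch lands in triage and gets routed into exactly one topic
    # queue, each with its own owners and scheduling policy.

    class ReviewQueue(object):
        def __init__(self, name, owners, policy="fifo"):
            self.name = name
            self.owners = owners  # cores signed up for this queue
            self.policy = policy  # e.g. "fifo", "sjf", "sjf+aging"
            self.patches = []

    QUEUES = {
        "triage": ReviewQueue("triage", owners=[]),
        "high-priority-bugfix": ReviewQueue("high-priority-bugfix",
                                            owners=["core-a"]),
        "libvirt-driver": ReviewQueue("libvirt-driver",
                                      owners=["core-b"],
                                      policy="sjf+aging"),
        "final-approval": ReviewQueue("final-approval",
                                      owners=["core-c"]),
    }

    def route(patch, target):
        # a triager moves a patch out of triage into a topic queue
        QUEUES["triage"].patches.remove(patch)
        QUEUES[target].patches.append(patch)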

> Remember, though, that the relevant metric is code quality, not review
> rate.

Not sure what conclusion you're driving towards there - obviously that's
true, but ... ignore review rate? measure code quality how?

Mark.



