[openstack-dev] [nova] A modest proposal to reduce reviewer load
Mark McLoughlin
markmc at redhat.com
Thu Jun 19 12:22:53 UTC 2014
On Thu, 2014-06-19 at 09:34 +0100, Matthew Booth wrote:
> On 19/06/14 08:32, Mark McLoughlin wrote:
> > Hi Armando,
> >
> > On Tue, 2014-06-17 at 14:51 +0200, Armando M. wrote:
> >> I wonder what the turnaround on trivial patches actually is; I bet
> >> it's very small, and, as Daniel said, the human burden is rather
> >> minimal (I would be more concerned about slowing them down in the
> >> gate, but I digress).
> >>
> >> I think that introducing a two-tier patch approval scheme can only
> >> mitigate the problem; I wonder if we'd need to go a lot further and
> >> figure out how to borrow concepts from queueing theory and apply
> >> them in the context of Gerrit. For instance, Little's law [1] says:
> >>
> >> "The long-term average number of customers (in this context reviews)
> >> in a stable system L is equal to the long-term average effective
> >> arrival rate, λ, multiplied by the average time a customer spends in
> >> the system, W; or expressed algebraically: L = λW."
> >>
> >> L can be used to determine the number of core reviewers a project
> >> needs at any given time in order to sustain a certain arrival rate
> >> and average time spent in the queue. If the number of core reviewers
> >> is a lot smaller than what L implies, the core team is understaffed
> >> and needs to grow.
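> >>
> >> As a back-of-the-envelope illustration, in Python (every number
> >> below is made up, purely to show the arithmetic):
> >>
> >>     # Illustrative Little's law estimate; none of these numbers
> >>     # are measured from Gerrit.
> >>     arrivals_per_day = 30.0      # lambda: patch revisions arriving
> >>     avg_days_in_queue = 5.0      # W: average wait per revision
> >>     in_flight = arrivals_per_day * avg_days_in_queue   # L = 150
> >>
> >>     reviews_per_merge = 2.0      # two core +2s to land a patch
> >>     reviews_per_core_day = 4.0   # assumed per-core throughput
> >>     cores_needed = (arrivals_per_day * reviews_per_merge
> >>                     / reviews_per_core_day)
> >>     print(in_flight, cores_needed)   # 150 in flight, 15 cores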
> >>
> >> If we figured out how to model and measure Gerrit as a queueing
> >> system, we could improve its performance a lot more effectively; for
> >> instance, this idea of privileging trivial patches over longer ones
> >> has its roots in shortest-job-first [2], a popular scheduling policy
> >> for M/G/1 queues [3], but that policy does nothing to age 'longer
> >> service time' patches and has no built-in preemption mechanism to
> >> avoid starvation.
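> >>
> >> To make the starvation point concrete, here's a toy single-server
> >> simulation under strict shortest-job-first (pure illustration,
> >> nothing Gerrit-specific): a ten-unit job never gets served while
> >> shorter jobs keep arriving.
> >>
> >>     import heapq
> >>     import random
> >>
> >>     # One long job waits; a new short job arrives before every
> >>     # service decision, and SJF always picks the short one.
> >>     random.seed(0)
> >>     queue = [(10.0, 'long-patch')]     # (service time, name)
> >>     clock = 0.0
> >>     while clock < 50.0:
> >>         heapq.heappush(queue, (1.0 + random.random(),
> >>                                'trivial-%.0f' % clock))
> >>         service, name = heapq.heappop(queue)   # shortest first
> >>         clock += service
> >>         if name == 'long-patch':
> >>             print('long patch served at', clock)
> >>             break
> >>     else:
> >>         print('long patch still starving at', clock)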
> >>
> >> Just a crazy opinion...
> >> Armando
> >>
> >> [1] - http://en.wikipedia.org/wiki/Little's_law
> >> [2] - http://en.wikipedia.org/wiki/Shortest_job_first
> >> [3] - http://en.wikipedia.org/wiki/M/G/1_queue
> >
> > This isn't crazy at all. We do have a problem that could surely be
> > studied and solved, or at least improved, by applying queueing theory
> > or lessons from fields like lean manufacturing. Right now we're
> > simply applying our intuition, and the little I've read about these
> > sorts of problems suggests that intuition can easily take you down
> > the wrong path.
> >
> > There's a bunch of things that occur to me just from glancing through
> > those articles:
> >
> > - Do we have an unstable system? Would it be useful to have arrival
> > and exit rate metrics to help highlight this (there's a sketch of
> > such metrics after this list)? Over what time period would those
> > rates need to be averaged to be useful? Daily, weekly, monthly, an
> > entire release cycle?
> >
> > - What are we trying to optimize for? The length of time in the
> > queue? The number of patches waiting in the queue? The response
> > time to a new patch revision?
> >
> > - We have a single queue, with a bunch of service nodes with a wide
> > variance between their service rates, very little in the way of
> > scheduling policy, a huge rate of service nodes sending jobs back
> > for rework, a cost associated with maintaining a job while it sits
> > in the queue, the tendency for some jobs to disrupt many other jobs
> > with merge conflicts ... not simple.
> >
> > - Is there any sort of natural limit in our queue size that makes the
> > system stable - e.g. do people naturally just stop submitting
> > patches at some point?
> >
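> > On the first of those questions, here's a minimal sketch of weekly
> > arrival/exit rates with a growing-queue flag; the input data is
> > hypothetical, standing in for what Gerrit's query interface would
> > give us:
> >
> >     from collections import Counter
> >     from datetime import date, timedelta
> >
> >     # (created, closed) dates per patch; closed is None while the
> >     # patch is still open.
> >     patches = [
> >         (date(2014, 6, 2), date(2014, 6, 5)),
> >         (date(2014, 6, 3), None),
> >         (date(2014, 6, 10), date(2014, 6, 18)),
> >     ]
> >
> >     def week_of(d):
> >         return d - timedelta(days=d.weekday())  # Monday of d's week
> >
> >     arrivals = Counter(week_of(c) for c, _ in patches)
> >     exits = Counter(week_of(x) for _, x in patches if x)
> >     for week in sorted(set(arrivals) | set(exits)):
> >         a, e = arrivals[week], exits[week]
> >         print(week, 'in:', a, 'out:', e,
> >               '<- queue growing' if a > e else '')
> >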
> > My intuition on all of this lately is that we need some way to model and
> > experiment with this queue, and I think we could make some interesting
> > progress if we could turn it into a queueing network rather than a
> > single, extremely complex queue.
> >
> > If we had a front-end for gerrit which tracked which queue a patch is
> > in, we could experiment with things like the following (there's a
> > rough code sketch after the list):
> >
> > - a triage queue, with non-cores signed up as triagers looking for
> > obvious mistakes and choosing the next queue for a patch to
> > enter
> >
> > - queues having a small number of cores signed up as owners - e.g.
> > high priority bugfix, API, scheduler, object conversion, libvirt
> > driver, vmware driver, etc.
> >
> > - we'd allow for a large number of queues so that cores could aim for
> > an "inbox zero" approach on individual queues, something that would
> > probably help keep cores motivated.
> >
> > - we could apply different scheduling policies to each of the
> > different queues - i.e. explicit guidance for cores about which
> > patches they should pick off the queue next.
> >
> > - we could track metrics on individual queues as well as the whole
> > network, identifying bottlenecks and properly recognizing which
> > reviewers are doing a small number of difficult reviews versus
> > those doing a high number of trivial reviews.
> >
> > - we could require some queues to feed into a final approval queue
> > where some people are responsible for giving an approved patch a
> > final sanity check - i.e. there would be a class of reviewer with
> > good instincts who quickly churn through already-reviewed patches
> > looking for the kinds of mistakes people tend to make when
> > they're down in the weeds.
> >
> > - explicit queues for large, cross-cutting changes like coding style
> > cleanups. Perhaps we could stop servicing these queues at certain
> > points in the release cycle, or reduce the rate at which they are
> > serviced.
> >
> > - we could include specs and client patches in the same network so
> > that they're prioritized in the same way.
> >
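> > A few lines of Python might make the shape of that front-end
> > clearer; everything here (queue names, policies, patch fields) is
> > hypothetical:
> >
> >     from collections import deque
> >
> >     def fifo(q):                   # oldest first
> >         return q.popleft()
> >
> >     def shortest_first(q):         # smallest diff first
> >         patch = min(q, key=lambda p: p['lines_changed'])
> >         q.remove(patch)
> >         return patch
> >
> >     # Each named queue pairs a backlog with a scheduling policy.
> >     queues = {
> >         'triage': (deque(), fifo),
> >         'high-priority-bugfix': (deque(), fifo),
> >         'libvirt-driver': (deque(), shortest_first),
> >         'final-approval': (deque(), fifo),
> >     }
> >
> >     def enqueue(name, patch):
> >         queues[name][0].append(patch)
> >
> >     def next_patch(name):          # what a core signed up to this
> >         q, policy = queues[name]   # queue should review next
> >         return policy(q) if q else None
> >
> >     enqueue('triage', {'id': 12345, 'lines_changed': 8})
> >     print(next_patch('triage'))
> >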
> > Lots of ideas, none of it is trivial ... but perhaps it'll spark
> > someone's interest :)
>
> This is all good stuff, but by the sounds of it experimenting in gerrit
> isn't likely to be simple.
Right, which is why I said "a front-end for gerrit" ... this would sit
in front of gerrit, at least to begin with.
> Remember, though, that the relevant metric is code quality, not review
> rate.
Not sure what conclusion you're driving towards there - obviously that's
true, but ... ignore review rate? measure code quality how?
Mark.