[openstack-dev] [nova] Is the BP approval process broken?

John Garbutt john at johngarbutt.com
Fri Aug 29 17:42:20 UTC 2014

I think this is now more about code reviews, but this is important...

On 29 August 2014 10:30, Daniel P. Berrange <berrange at redhat.com> wrote:
> On Fri, Aug 29, 2014 at 11:07:33AM +0200, Thierry Carrez wrote:
>> Joe Gordon wrote:
>> > On Thu, Aug 28, 2014 at 2:43 PM, Alan Kavanagh
>> > <alan.kavanagh at ericsson.com <mailto:alan.kavanagh at ericsson.com>> wrote:
>> >
>> >>     I share Donald's points here, I believe what would help is to
>> >>     clearly describe in the Wiki the process and workflow for the BP
>> >>     approval process and build in this process how to deal with
>> >>     discrepancies/disagreements and build timeframes for each stage and
>> >>     process of appeal etc.
>> >>     The current process would benefit from some fine tuning and helping
>> >>     to build safe guards and time limits/deadlines so folks can expect
>> >>     responses within a reasonable time and not be left waiting in the cold.
>> >
>> > This is a resource problem, the nova team simply does not have enough
>> > people doing enough reviews to make this possible.
>> I think Nova lacks core reviewers more than it lacks reviewers, though.
>> Just looking at the ratio of core developers vs. patchsets proposed,
>> it's pretty clear that the core team is too small:
>> Nova: 750 patchsets/month for 21 core = 36
>> Heat: 230/14 = 16
>> Swift: 50/16 = 3
>> Neutron has the same issue (550/14 = 39). I think above 20, you have a
>> dysfunctional setup. No amount of process, spec, or runway will solve
>> that fundamental issue.


>> The problem is, you can't just add core reviewers, they have to actually
>> understand enough of the code base to be trusted with that +2 power. All
>> potential candidates are probably already in. In Nova, the code base is
>> so big it's difficult to find people that know enough of it. In Neutron,
>> the contributors are often focused on subsections of the code base so
>> they are not really interested in learning enough of the rest. That
>> makes the pool of core candidates quite dry.

The other point is keeping the reviews consistent. Making the team
larger makes that harder.

If we did a better job of discussing core disagreements more in the
nova-meeting, maybe that would help keep consistency between a larger
group of people. But it boils down to trusting each other, and a group
bigger than 20, is a lot of people to get to know.

>> I fear the only solution is smaller groups being experts on smaller
>> codebases. There is less to review, and more candidates that are likely
>> to be experts in this limited area.
>> Applied to Nova, that means modularization -- having strong internal
>> interfaces and trusting subteams to +2 the code they are experts on.
>> Maybe VMWare driver people should just +2 VMware-related code. We've had
>> that discussion before, and I know there is a dangerous potential
>> quality slope there -- I just fail to see any other solution to bring
>> that 750/21=36 figure down to a bearable level, before we burn out all
>> of the Nova core team.

This worked really well for Cinder, and I hope Gantt will do the same
kind of thing for Scheduling.

It certainly feels like we really need to split things up, maybe:
* API (talks to compute api to creates tasks and gets objects)
* core task orchestration and persistence (compute api, db objects,
conductor, talks to compute manager api, scheduler api, network api)
* compute manager + "drivers" (gets instance objects)
* Scheduling (models resources, gets )
* nova-network

But clearly, that will make evolving those interfaces much harder, the
separate they become.

Certainly we fee a few release away from some of those splits.

> I broadly agree - I think that unless Nova moves more towards something
> that is closer to the Linux style subsystem maintainer model we are
> doomed. I know in Linux, the maintainers actually use separate git trees,
> and that isn't what I mean - I think using a single git tree is still
> desirable (at least for now). What I mean is that we should place more
> trust on the opinion of the people who are experts for a particular
> area of code. Let those experts take on a greater burden of the code
> review so core team can put more focus on actual merge approval.
> I know some of the core team try to do this implicitly - eg we know who
> some of the main people involved in hyperv or vmware are, so will tend
> to treat their +1 as an effective +2 from the POV of their driver code,
> but our rules still require two actual +2s from core, so it doesn't
> entirely help us right now. I think we need to do some work in tooling
> to make this more of an explicit process though.

I do prefer a distinction between core and sub-core-team.

Just because we want multiple sub-core-team members, but still make it
easier to spot the core reviewer.

> The problem is that gerrit does not allow us to say person X has +2 for
> code that touches directory path /foo/bar. The +2 is global to the entire
> repository. We could try to deal with this problem outside of gerrit
> though. As a starting point, each virt driver (or major functional area
> of nova codebase) should have an explicit list of people who are considered
> to be the "core team" for that area of code.  From such a list, tools like
> gerrymander (or anything else that can query gerrit), could see when a
> person in those lists +1'd a change touching their area of responsibility
> and change that to be presented as a "+1.5".

Or we create a +3 for core, and +2 for sub team, but I like the distinction.

> This would make it very explicit to the reviewers that they should consider
> the change to be (almost) equivalent to a +2.  We could potentially then
> relax the rule of
>      "+A requires two +2s"
> to be
>      "+A requires (two +2s) or (one +2 and one +1.5)"

I think that could work, but maybe:

     "+A requires (two +2s) or (one +2 and two +1.5)"

Only because it would be good for the sub-core teams to see each
others +1.5s. But maybe thats silly, and just makes things worse

> I think this would significantly improve our review throughput. I think
> it could also help us get people to gain greater responsibility. The
> jump from regular contributor to core team member is quite a high bar.
> If we had the intermediate step of subsystem team member that would ease
> the progression. It would also give the subsystem teams a greater sense
> of engagement & value in the nova community

This keeps being suggested. I always used to worry about this because
of things like:

* code consistency is easier when more reviews know more of nova

* makes it harder to get the natural alignment of core reviews, as we
would less regularly see each others code reviews

* almost encourages people to work in a silo of nova, where ideally we
want help in the core of nova too

* would be bad if people got stuck in one part of nova by this, even
if they wanted to go work in other areas of nova

But honestly, we are running out of options here. Maybe we should try this out?

It is risky (in terms of keeping the code base consistent), but we do
need to scale out our review capacity. I am wondering what other ideas
people have...?


More information about the OpenStack-dev mailing list