[openstack-dev] [Nova][Keystone] The unbearable lightness of specs

Daniel P. Berrange berrange at redhat.com
Thu Jun 25 09:19:36 UTC 2015

On Wed, Jun 24, 2015 at 06:30:06PM -0700, James Bottomley wrote:
> On Wed, 2015-06-24 at 17:25 +0100, Daniel P. Berrange wrote:
> > On Wed, Jun 24, 2015 at 11:52:37AM -0400, Adam Young wrote:
> > > On 06/24/2015 06:28 AM, Nikola Đipanov wrote:
> > > >Gerrit and our spec template are a horrible tool for
> > > >discussing design.
> > > This is the heart of the problem.
> > > 
> > > 
> > > I think that a proper RFE description in the bug tracker is the best place
> > > to start.  Not a design of the solution, but a statement of the problem.
> > > 
> > > Then, the rest of the discussion should take place in the code. Keystoen has
> > > the Docs right in the code, as do, I think, every other project.  Don't sign
> > > off on a patch for a major feature unless the docs have been updated to
> > > explain that feature.  It will keep us from bike shedding about Database
> > > schemas.
> > 
> > What you are describing is sounds like the situation that existed before
> > the specs concept was introduced. We had a statement of problem in the
> > blueprint, and then people argued over the design in the code reviews.
> > It really didn't work at all - code reviews are too late in the workflow
> > to start discussions around the design, as people are already invested
> > in dev work at that point and get very upset when you then tell them
> > to throw away their work. Which happened repeatedly. You could say that
> > the first patch submitted to the code repository should simply be a doc
> > file addition, that describes the feature proposal and we should discuss
> > that before then submitting code patches, but then that's essentially
> > just the specs again, but with the spec doc in the main nova.git instead
> > of nova-specs.git.
> I don't think you meant this as a don't do it, just an experience point:
> you're saying *we* couldn't make the process work, but that doesn't mean
> that the process itself (specless code reviews) can't be made to work.
> It works fine for a lot of open source projects, so the process must be
> workable.  However, what the experience has shown is that there's a
> bottleneck which isn't removed simply by removing the spec process.
> On the spec question, the reason projects like the Linux Kernel are so
> code driven is that with code it's harder to block a submission on
> esoteric grounds, whereas with no code it is easier to argue endlessly
> about minutiae.  I think this might be the reason for the "lightness"
> Nikola complains about: the less you put in a spec, the less reason
> people have to weigh in on it and delay its approval.  Perhaps an
> appropriate question is: "is that fear well founded or just anecdotal?"

While I agree with the broad sentiment, I feel Linux has some flexibility
in code design that Nova & openstack projects in general do not. When it
comes to additions to the userspace API, the Linux design debates are just
as long and detailed as I've seen in Nova, for the reason that once you
decide in a userspace ABI you commit to it forever. Now in Nova, there
are quite alot of areas where we have such considerations - the RPC
interface between Nova services is a stable ABI, the definition of
the data structures passed over RPC is a stable ABI, the rest interface
is a stable ABI, the controls (image props & flavor extra specs) are
long term APIs.

In general with Linux, as long as you are not touching userspace ABI,
you have alot more flexibility to accept design mistakes, as they can
be corrected at will later without back compatibility issues in general.
With Nova, a very very large portion (more than the the majority IME)
of new features are impacting on the public API or ABIs that we have
to maintain long term. We've done a pretty awful job at getting those
right in the past, and this is one of the things that the specs process
has allowed us to significantly improve on.  This has certainly resulted
in long drawn out design decisions on specs which has delayed code, but
we have to be careful to balance the need for correct ABI design, with
the need to not artificially delay developers.

I don't think we have that balance quite right, as we do tend to delay
stuff more than is needed, partly due to review bandwidth constraints.

> If I look at what the above says about the main issue that doesn't get
> solved by removing the spec process, it's review pressure: how do you
> increase the throughput of approval/rejection.  Note I didn't advocate
> more reviews: The metric for success should be time to resolution
> (accept/reject), not the number of reviews, if we ask everyone to double
> their time spent reviewing and the net result of this is that the
> average number of reviews to resolution doubles, nothing is gained.  So
> perhaps it's time to actually measure the number of reviews to
> resolution and look at ways to reduce this metric.  That will make the
> available current review bandwidth go further and reduce the time to
> actual resolution without anyone having to do more work.  The low
> hanging fruit in this might be the obvious patches: obvious accept
> (simple bug fix) or obvious reject (bogus code) should go through after
> one review: do they?  Perhaps one other is wasting core time: after a
> couple of failed reviews by people who could +2 the thing, is it time to
> reject?  And perhaps, finally, there should be a maximum number of
> reviews before an automatic reject?
> Sorry for derailing the argument, but I do still see review bandwidth as
> a key issue behind all the problems.

Not answering your questions, but on a slightly related note.

There has also been alot of work in infrastructure to help us when
dealing with API design decisions. For example, in the past, most
internal objects were just undocumented dicts, so virt drivers often
added / used random stuff without most people realizing we had just
committed to a new ABI constraint. With the Nova versioned objects
framework, we've gained alot of clarity over object design, which
has taken some of the pressure off review of virt driver code - any
ABI changes are centralized in nova/objects/*.py so we can easily
identify them. We've done similar formalization of REST API versioning
and trying to formalize other areas of the code. Ultimately this should
have a benefit on our ability to delegate reviews in areas of the code,
and finally enable a goal of mine which is to allow virt driver teams
independance in dealing reviewing & merging with code in their drivers.
This should also let us be more pragmatic in deciding which features
will need detailed review of specs upfront, vs which code be be done
via a more light weight process with little/no specs

|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

More information about the OpenStack-dev mailing list