[openstack-dev] [Nova] The unbearable lightness of specs

Nikola Đipanov ndipanov at redhat.com
Wed Jun 24 16:43:43 UTC 2015

On 06/24/2015 04:42 PM, Andrew Laski wrote:
> On 06/24/15 at 02:38pm, Nikola Đipanov wrote:
>> On 06/24/2015 01:42 PM, Daniel P. Berrange wrote:
>>> On Wed, Jun 24, 2015 at 11:28:59AM +0100, Nikola Đipanov wrote:
>>>> Hey Nova,
>>>> I'll cut to the chase and keep this email short for brevity and
>>>> clarity:
>>>> Specs don't work! They do nothing to facilitate good design happening,
>>>> if anything they prevent it. The process layered on top with only a
>>>> minority (!) of cores being able to approve them, yet they are a prereq
>>>> of getting any work done, makes sure that the absolute minimum that
>>>> people can get away with will be proposed. This in turn goes and
>>>> guarantees that no good design collaboration will happen. To add insult
>>>> to injury, Gerrit and our spec template are a horrible tool for
>>>> discussing design. Also the spec format itself works for only a small
>>>> subset of design problems Nova development is faced with.
>>> I'd like to see some actual evidence to backup a sweeping statement
>>> as "Specs dont work. They do nothing to facilitate good design
>>> happening,
>>> if anything they prevent it."
>>> Comparing Nova today, with Nova before specs were introduced, I think
>>> that specs have had a massive positive impact on the amount of good
>>> design and critique that is happening.
>>> Before specs, the average blueprint had no more than 3 lines of text
>>> in its description. Occassionally a blueprint would link to a wiki
>>> page or google doc with some design information, but that was very
>>> much the exception.
>>> When I was reviewing features in Nova before specs came along, I spent
>>> alot of time just trying to figure out what on earth the code was
>>> actually attempting to address, because there was rarely any statement
>>> of the problem being addressed, or any explanation of the design that
>>> motivated the code.  This made life hard for reviewers trying to figure
>>> out if the code was acceptable to merge.  It is pretty bad for
>>> contributors
>>> trying to implement new features too, as they could spend weeks or
>>> months
>>> writing and submitting code, only to be rejected at the end because the
>>> (lack of any design discussions) meant they missed some aspect of the
>>> problem which in turn meant all their work was in vain. That was a
>>> collosal
>>> waste of everyone's time and resulted in some of the very horrible code
>>> impl decisions we're still living with today.
>> Yes there are good reasons to have a place to discuss implementation
>> before doing it, especially if you are new to the project.
> This has been a huge boon.  And I don't think the benefits are realized
> only for developers new to the project.
> It seems to me that the challenge going forward is how to keep this
> benefit while reducing/eliminating some of the drawbacks.  But dropping
> specs without a replacement would eliminate both the benefit and the
> drawbacks.

I can agree with this. But...

The problem is that changing process can really be done only two times a
year. This is nowhere near flexible enough and once a bad process is in
place we seem to stick to it almost like it's a public API :) to Nova

This lack of flexibility due to communication overhead is why we should
err on the side of as little formal process as possible, which is the
opposite of what we've been doing.

>> The behemoth we have now is not that. It's miles away as you point out
>> below.
>>>> That's only a subset of problems. Some more, you ask? OK. No clear
>>>> guidelines as to what needs a spec, that defaults to "everything does".
>>>> And spec being the absolute worst way to judge the validity of some of
>>>> the things that do require them.
>>>> Examples of the above are everywhere if you care to look for them but
>>>> some that I've hit _this week_ are [1] (spec for a quick and dirty
>>>> fix?!
>>>> really?!) [2] (spec stuck waiting for a single person to comment
>>>> something that is an implementation detail, and to make matter worse
>>>> the
>>>> spec is for a bug fix) [3] (see how ill suited the format is for a
>>>> discussion + complaints about grammar and spelling instead of actual
>>>> point being made).
>>>> Nova's problem is not that it's big, it's that it's big _and_ tightly
>>>> coupled. This means no one can be trusted to navigate the mess
>>>> successfully, so we add the process to stop them. What we should be
>>>> doing is fixing the mess, and the very process is preventing that.
>>>> Don't take my word for it - ask the Gantt subteam who have been trying
>>>> to figure out the scheduler interface for almost 4 cycles now. Folks
>>>> doing Cells might have a comment on this too.
>>>> The good news is that we have a lot of stuff in place already to
>>>> help us
>>>> reduce this massive coupling of everything. We have versioned objects,
>>>> we have versioned RPC. Our internal APIs are terrible, and provide no
>>>> isolation, but we have the means to iterate and figure it out.
>>>> I don't expect this process issues will get solved quickly, If it were
>>>> up to me I'd drop the whole thing, but I understand that it's not how
>>>> it's done.
>>>> I do hope this makes people think, discuss and move things into the
>>>> direction of facilitating quality software development instead of
>>>> outright preventing it. I'll follow up with some ideas on how to go
>>>> forward once a few people have commented back.
>>> I will agree that the specs process has a number of flaws - in
>>> particular
>>> I think we've treated it as too much of a rigid process resulting it in
>>> being very beurocractic. In particular I think are missing an ability to
>>> be pragmmatic in decisions about the level of design required and
>>> whether
>>> specs are required. The idea of allowing blueprints without specs in
>>> some cases was an attempt to address this, but I don't feel it has been
>>> very successful - there is still too much stuff being forced through
>>> the specs process unncessarily imho.
> I haven't seen much, really anything, that's been forced through
> unnecessarily but I have seen a number of specs written for things that
> were petitioning for a specless blueprint.  I'm not sure why that's
> done, but perhaps it comes down to a fear of missing the spec deadline
> and being blocked for a cycle.

This is a huge part of the reason too, but not only.

If anything we should default to no spec, and only ask for a spec when
it's obvious that it's needed.

Currently though, you can't get the relevant people to looks at new BPs
in LP, but a lot of people get emailed when there's a new patch in the
specs repo. Hence - the default is to write a spec. It's very strange to
me that no one sees this as a problem.

>>> I've repeatedly stated that the fact that we created an even smaller
>>> clique of people to approve specs (nova-drivers which is a tiny subset
>>> of the already faaaar too small nova-core) is madness, as it creates
>>> an even worse review burden on them, and thus worsens the bottleneck
>>> than we already have.
> I agree that the number of people who can approve specs is too small at
> the moment.  But relatively few people are actually reviewing specs so
> building trust here is difficult.  And since I'm not seeing the entirety
> of nova-core doing spec reviews I don't think we should just give that
> entire group +2 ability on specs.

This kind of thinking is very strange to me. How are the
skills/trust/whatever different from what is expected of a person for
being core that +2s/+Ws patches? If anything reviewing the feature once
it's coded up and spotting the shortcomings requires both more work and
more familiarity with the code. You could argue that some people's vote
should count more when it comes to questions of architecture, but that's
true for a number of other things across the core team when it comes to
code reviews too.

Also the word "trust" that keeps getting mentioned in these discussions
is a blanket term that means AFAICT everything and nothing. Most of
current Nova cores have been involved with the project for several
years, and are well known by other cores for their work. If there's lack
of "trust" there about doing competent software engineering work, it's
either through not paying attention, or paranoia, or I don't understand
what the "trust" is supposed to mean here.

>>> In reviewing specs, we also often get into far too much detail about
>>> the actual implementation, which is really counterproductive. eg when
>>> get down to bikeshedding about the names of classes, types of object
>>> attributes, and so forth, it is really insane. That level of detail
>>> and bikesheeding belongs in the code review, not the design for the
>>> most part.
> Totally agree here.  The spec shouldn't constrain an implementation
> except in broad strokes, or places that are difficult to change after
> the fact.  I have asked people to remove implementation details from a
> spec for that reason.  But I have been asked to add details about
> database indexes when adding a new table, which I think is fair in order
> to avoid a second migration later.

DB indexes are exactly the implementation detail that does not need to
be on the spec IMHO. Without seeing the whole picture of how the data is
being used and queried, you can't properly decide. At that point you
should have a patch.

What we should do is place special review scrutiny on such patches. As
I've said before - not all reviews are made equal (some are way harder).
I have ranted before about our love of review counts, which are a very
bad metric, but that's OT here.

Bottom line - we can't process-away the need for tough code reviews.

>>> Specs are also inflexible when it comes to dealing with features that
>>> cross multiple projects, because we silo'd spec reviews against the
>>> individual projects. This is sub-optimal, but ultimately not a show
>>> stopper.
>>> I also think the way we couple spec approval & reviews to the dev
>>> cycles is counterproductive. We should be willing to accept and
>>> review specs at any point in any cycle, and once approved they should
>>> remain valid for a prolonged period of time - not require us to go
>>> through re-review every new dev cycle as again that's just creating
>>> extra burden. We should of course though reserve the right to unapprove
>>> specs if circumstances change, invlidating the design for the previous
>>> approval.
> I agree with this.  Though I do think we need to be realistic and say
> that spec reviews are a lower priority in later parts of a cycle.  As a
> reviewer I like the spec freeze and not getting pinged for spec reviews
> constantly, but it is a very big blocker for people whose spec doesn't
> get approved in a cycle.  I think we need a more middle ground.
>>> In short specs are far from perfect, but to say specs don't solve/help
>>> anything todo with design in nova is really ignoring our history from
>>> before the time specs existed. We must continue to improve our process
>>> overall the biggest thing we lost IMHO is agility and pragmatism in
>>> our decision making - I think we can regain that without throwing away
>>> the specs idea entirely.
> +1
>> Basically I agree with everything above.
>> I urge people to reply to this instead of my original email as the
>> writing is more detailed and balanced.
>> Thanks Daniel!
>> N.
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

More information about the OpenStack-dev mailing list