[openstack-dev] [Nova] The unbearable lightness of specs

John Garbutt john at johngarbutt.com
Thu Jun 25 11:09:07 UTC 2015

Apologies for going back in time in this thread, but I feel like I should
respond directly to this original email...

On 24 June 2015 at 11:28, Nikola Đipanov <ndipanov at redhat.com> wrote:
> Hey Nova,
> I'll cut to the chase and keep this email short for brevity and clarity:
> Specs don't work!

In many cases, specs really don't work.
But in a lot of cases they have proved a useful record of the decision.

We got feedback that contributors wanted a way to talk about the
direction they were going in, and get some level of consensus
around that, before starting to code. I never intend to enforce that
use, but it's a useful tool for that.

> They do nothing to facilitate good design happening,
> if anything they prevent it.

They are a great tool to record that consensus was reached on a thorny
issue, and more importantly, record why that decision was made.
Because we record the why, we can revisit that decision in a more
informed way.

Now there are many examples of specs that just go off into stupid
details that would be best reviewed in code. When that happens, it's a
total waste of everyone's time.

> The process layered on top with only a
> minority (!) of cores being able to approve them, yet they are a prereq
> of getting any work done, makes sure that the absolute minimum that
> people can get away with will be proposed.

The idea of a smaller set is that there are fewer reviews. But now
we can't keep up with code or spec reviews.

My preference is to have nova-core with +2/-2, with a smaller group
with +W just to help ensure consistency within the smaller set of
reviews. That was my proposed first step towards changing things. I
think we need to revisit the decision not to go in that direction. It's
possible we want all nova-core to have +W, but I like the idea of trying
with a smaller group to check for consistency, mostly as a middle ground.

> This in turn goes and
> guarantees that no good design collaboration will happen.

The spec just records a consensus vote. The fact there is a vote can
spark debate.

Minimal specs are my preference. You only include what you think needs
to be agreed before you start coding. In the case where there is
nothing much to talk about we don't require a spec.

> To add insult
> to injury, Gerrit and our spec template are a horrible tool for
> discussing design.

It's an order of magnitude better than the Launchpad blueprint
whiteboard, and to some extent, linked wiki pages.

It was designed as a stopgap measure, but no better alternative has
come forward at this point.

> Also the spec format itself works for only a small
> subset of design problems Nova development is faced with.

Help evolving that is very welcome.

The idea of the current format is that you can compare specs side by side,
and different readers can look at the section that interests them the
most.

Also, many of the sections were added to make sure people thought about
that aspect, so they can explicitly state that there is no impact and
that the area has been considered.

I kinda hope there is a more efficient way of doing the template. One
idea I have is just to have a second template that just starts out
with None in every section, so you have something to cut and paste
from. I kinda hope someone comes up with a better idea.

> That's only a subset of problems. Some more, you ask? OK. No clear
> guidelines as to what needs a spec, that defaults to "everything does".
> And spec being the absolute worst way to judge the validity of some of
> the things that do require them.

We tried to create this rule of thumb:

The harder problem, I think, is how much detail is required.
The spec template tries to cover that (as of Kilo it changed a lot):

For spec-less blueprints, the rule has kinda become: if we can't
quickly agree on it in the Nova meeting, then it probably needs a
spec. It's quite hard to qualify that, and it's probably not the best
rule either.

Do we need to do better? Yes, we do. Please help.

> Examples of the above are everywhere if you care to look for them but
> some that I've hit _this week_ are [1] (spec for a quick and dirty fix?!
> really?!)

The reason it's a spec is because it makes a REST API change.

Even if it didn't, it's a good way of agreeing (or not) whether it's a good direction.

> [2] (spec stuck waiting for a single person to comment
> something that is an implementation detail, and to make matter worse the
> spec is for a bug fix)

Not sure why it's stuck, but if something is stuck:
* ping the person on IRC
* if there is no response, raise it at the Nova meeting in the relevant
  stuck review section

> [3] (see how ill suited the format is for a
> discussion + complaints about grammar and spelling instead of actual
> point being made).

Pushing a follow-on change with the fixes, or uploading a new patch
set, has been suggested.

Personally my dyslexia means I just don't see most of those problems
when I read through things.

> Nova's problem is not that it's big, it's that it's big _and_ tightly
> coupled. This means no one can be trusted to navigate the mess
> successfully, so we add the process to stop them. What we should be
> doing is fixing the mess, and the very process is preventing that.

The priorities are trying to make sure we can focus on fixing that mess:

I am actively trying to get folks to write up ideas, so we can get
more people involved fixing this technical debt:

If people are blocked by some required refactoring, they should be
able to help out with that. We have historically done a bad job of
that; we must do a lot better at it to scale out our development.

> Don't take my word for it - ask the Gantt subteam who have been trying
> to figure out the scheduler interface for almost 4 cycles now. Folks
> doing Cells might have a comment on this too.
> The good news is that we have a lot of stuff in place already to help us
> reduce this massive coupling of everything. We have versioned objects,
> we have versioned RPC. Our internal APIs are terrible, and provide no
> isolation, but we have the means to iterate and figure it out.

Meeting user expectations (like live upgrades, REST API stability, a
strong ecosystem around the API, etc.) while evolving so much of the
code base is really hard, but we are making progress.

I mean this in the nicest way, but Nova was all written in a weekend,
then had lots of features and abstractions shoved into it, then we
added some bug fixes, then got loads of users running it in production
who needed to live-upgrade to get the new bug fixes.

We are making great progress un-tying that resultant spaghetti.

The fear of adding to that mess, and being left to go fix that, is
driving some of the good and a lot of the bad bits of the current spec
process.

> I don't expect this process issues will get solved quickly, If it were
> up to me I'd drop the whole thing, but I understand that it's not how
> it's done.

It's important to unlearn bad habits.

But I feel we are still in a better place than a no-specs world.

> I do hope this makes people think, discuss

It's important.

My current goal is to be clear on WHY we do what we are doing, so the
debate will be more productive.

Being clear on what we are doing might also be useful.

> and move things into the
> direction of facilitating quality software development instead of
> outright preventing it.

That has always been the aim here.

We are a large group of professional software developers trying to do
our best, and trying to collaborate in an efficient and open way.

Can we do better? I sure hope we can.

> I'll follow up with some ideas on how to go
> forward once a few people have commented back.

Yes, although email is a really bad medium for this debate. I feel a
video conference to hash out a proposal to discuss at the midcycle
could help us, and we could bring a concrete proposal back to the ML.
Happy to organise something for those who are interested in trying to
put together a proposal.

