Open Stack

Mon Aug 19 15:47:32 UTC 2013

In this thread about code review:

  http://lists.openstack.org/pipermail/openstack-dev/2013-August/013701.html

I mentioned that I thought there were too many blueprints created without
sufficient supporting design information and were being used for "tickbox"
process compliance only. I based this assertion on a gut feeling I have
from experiance in reviewing.

To try and get a handle on whether there is truely a problem, I used the
launchpadlib API to extract some data on blueprints [1].

In particular I was interested in seeing:

  - What portion of blueprints have an URL containing an associated
    design doc,

  - How long the descriptive text was in typical blueprints

  - Whether a blueprint was created before or after the dev period
    started for that major release.

The first two items are easy to get data on. On the second point, I redid
line wrapping on description text to normalize the line count across all
blueprints. This is because many blueprints had all their text on one
giant long line, which would skew results. I thus wrapped all blueprints
at 70 characters.

The blueprint creation date vs release cycle dev start date is a little
harder. I inferred the start date of each release, by using the end date
of the previous release. This is probably a little out but hopefully not
by enough to totally invalidate the usefulness of the stats below. Below,
"Early" means created before start of devel, "Late" means created after
the start of devel period.

The data for the last 3 releases is:

  Series: folsom
    Specs: 178
    Specs (no URL): 144
    Specs (w/ URL): 34
    Specs (Early): 38
    Specs (Late): 140
    Average lines: 5
    Average words: 55

  Series: grizzly
    Specs: 227
    Specs (no URL): 175
    Specs (w/ URL): 52
    Specs (Early): 42
    Specs (Late): 185
    Average lines: 5
    Average words: 56

  Series: havana
    Specs: 415
    Specs (no URL): 336
    Specs (w/ URL): 79
    Specs (Early): 117
    Specs (Late): 298
    Average lines: 6
    Average words: 68

Looking at this data there are 4 key take away points

  - We're creating more blueprints in every release.

  - Less than 1 in 4 blueprints has a link to a design document. 

  - The description text for blueprints is consistently short
    (6 lines) across releases.

  - Less than 1 in 4 blueprints is created before the devel
    period starts for a release.

You can view the full data set + the script to generate the
data which you can look at to see if I made any logic mistakes:

  http://berrange.fedorapeople.org/openstack-blueprints/

There's only so much you can infer from stats like this, but IMHO think the
stats show that we ought to think about how well we are using blueprints as
design / feature approval / planning tools.

That 3 in 4 blueprint lack any link to a design doc and have only 6 lines of
text description, is a cause for concern IMHO. The blueprints should be giving
code reviewers useful background on the motivation of the dev work & any
design planning that took place. While there are no doubt some simple features
where 6 lines of text is sufficient info in the blueprint, I don't think that
holds true for the majority.

In addition to helping code reviewers, the blueprints are also arguably a
source of info for QA people testing OpenStack and for the docs teams
documenting new features in each release. I'm not convinced that there is
enough info in many of the blueprints to be of use to QA / docs people.

The creation dates of the blueprints are also an interesting data point.
If the design summit is our place for reviewing blueprints and 3 in 4
blueprints in a release are created after the summit, that's alot of
blueprints potentially missing summit discussions. On the other hand many
blueprints will have corresponding discussions on mailing lists too,
which is arguably just as good, or even better than, summit discussions.

Based on the creation dates though & terseness of design info, I think
there is a valid concern here that blueprints are being created just for
reason of "tickbox" process compliance. 

In theory we have an approval process for blueprints, but are we ever
rejecting code submissions for blueprints which are not yet approved ?
I've only noticed that happen a couple of times in Nova for things that
were pretty clearly controversial.

I don't intend to suggest that we have strict rules that all blueprints
must be min X lines of text, or be created by date Y. It is important
to keep the flexibility there to avoid development being drowned in
process without benefits.

I do think we have scope for being more rigourous in our review of
blueprints, asking people to expand on the design info associated with
a blueprint. Perhaps also require that a blueprint is actually approved
by the core team before we go to the trouble of reviewing & approving
the code implementing a blueprint in Gerrit.

Regards,
Daniel

[1] http://berrange.fedorapeople.org/openstack-blueprints/blueprint.py
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

Open Stack

[openstack-dev] Stats on blueprint design info / creation times

OpenStack

Community

Documentation

Branding & Legal