[openstack-dev] [nova] An analysis of code review in Nova

Shawn Hartsock hartsock at acm.org
Sat Mar 22 10:19:21 UTC 2014


On Fri, Mar 14, 2014 at 6:58 PM, Dan Smith <dms at danplanet.com> wrote:
> Review latency will be directly affected by how good the refactoring
> changes are staged. If they are small, on-topic and easy to validate,
> they will go quickly. They should be linearized unless there are some
> places where multiple sequences of changes make sense (i.e. refactoring
> a single file that results in no changes required to others).


I'm going to bring this to the next
https://wiki.openstack.org/wiki/Meetings/VMwareAPI we can start
working on how we'll set the order for this kind of work. Currently we
have a whole blueprint for refactoring a single method. That seems
silly. I'll want to come up with a plan around how to restructure the
driver so we can avoid some of the messes we've seen in the past.

I want to avoid one big refactor effort that drags on, but I also want
to address bigger problems we have inside the driver. For example,
vm_util.py seems to have become burdened with work that it shouldn't
have. It also performs a great number of unnecessary round trips using
a vm_ref to pull individual vm details over one at a time. Introducing
a VirtualMachine object that held all these references would simplify
some operations (I'm not the first person to suggest this and it
wasn't novel to me either when it was presented.)

It would seem Juno-1 would be the time to make these changes and we
need to serialize this work to keep reviewers from losing their
marbles trying to track it all. I would like to work out a plan for
this in conjunction with interested core-reviewers who would be
willing to more or less sponsor this work. Because of the problems
Matt points out, I don't want to tackle this in a haphazard or
piece-meal way since it could completely disrupt any new blueprint
work people may have targeted for Juno.

Having said that, on this driver, new blueprints in the last several
cycles have introduced serious feature regressions. Several minor bug
fixes have altered and introduced key architectural components that
have broken multiple critical features. In my professional opinion
this has a root cause based on the drivers tightly coupled and
non-cohesive internal design.

The driver design is tightly coupled in that a change in one area
forces multiple updates and changes *throughout* the rest of the
driver. This is true in testing as well. The testing design often
requires you to trace the entire codebase if you add a single optional
parameter to a single method. This does not have to be true.

The driver design is non-cohesive in that important details and
related information is spread throughout the driver. You must be aware
at all times (for example) whether or not your current operation
requires you to check if your vm_ref is outdated (we just worked on
several last minute critical bugs for RC1 where myself and others
pulled all nighters to fix the issue in a bad case of Heroic
Programming).

I would like to stop the http://c2.com/cgi/wiki?CodeVomit please. May we?

-- 
# Shawn.Hartsock - twitter: @hartsock - plus.google.com/+ShawnHartsock



More information about the OpenStack-dev mailing list