[openstack-dev] [nova] An analysis of code review in Nova
mriedem at linux.vnet.ibm.com
Wed Mar 12 18:28:23 UTC 2014
On 2/25/2014 6:36 AM, Matthew Booth wrote:
> I'm new to Nova. After some frustration with the review process,
> specifically in the VMware driver, I decided to try to visualise how the
> review process is working across Nova. To that end, I've created 2
> graphs, both attached to this mail.
> Both graphs show a nova directory tree pruned at the point that a
> directory contains less than 2% of total LOCs. Additionally, /tests and
> /locale are pruned as they make the resulting graph much busier without
> adding a great deal of useful information. The data for both graphs was
> generated from the most recent 1000 changes in gerrit on Monday 24th Feb
> 2014. This includes all pending changes, just over 500, and just under
> 500 recently merged changes.
> pending.svg shows the percentage of LOCs which have an outstanding
> change against them. This is one measure of how hard it is to write new
> code in Nova.
> merged.svg shows the average length of time between the
> ultimately-accepted version of a change being pushed and being approved.
> Note that there are inaccuracies in these graphs, but they should be
> mostly good. Details of generation here:
> https://github.com/mdbooth/heatmap. This code is obviously
> single-purpose, but is free for re-use if anyone feels so inclined.
> The first graph above (pending.svg) is the one I was most interested in,
> and shows exactly what I expected it to. Note the size of 'vmwareapi'.
> If you check out Nova master, 24% of the vmwareapi driver has an
> outstanding change against it. It is practically impossible to write new
> code in vmwareapi without stomping on an oustanding patch. Compare that
> to the libvirt driver at a much healthier 3%.
> The second graph (merged.svg) is an attempt to look at why that is.
> Again comparing the VMware driver with the libvirt we can see that at 12
> days, it takes much longer for a change to be approved in the VMware
> driver than in the libvirt driver. I suspect that this isn't the whole
> story, which is likely a combination of a much longer review time with
> very active development.
> What's the impact of this? As I said above, it obviously makes it very
> hard to come in as a new developer of the VMware driver when almost a
> quarter of it has been rewritten, but you can't see it. I am very new to
> this and others should validate my conclusions, but I also believe this
> is having a detrimental impact to code quality. Remember that the above
> 12 day approval is only the time for the final version to be approved.
> If a change goes through multiple versions, each of those also has an
> increased review period, meaning that the time from first submission to
> final inclusion is typically very, very protracted. The VMware driver
> has its fair share of high priority issues and functionality gaps, and
> the developers are motived to get it in the best possible shape as
> quickly as possible. However, it is my impression that when problems
> stem from structural issues, the developers choose to add metaphorical
> gaffer tape rather than fix them, because fixing both creates a
> dependency chain which pushes the user-visible fix months into the
> future. In this respect the review process is dysfunctional, and is
> actively detrimental to code quality.
> Unfortunately I'm not yet sufficiently familiar with the project to
> offer a solution. A core reviewer who regularly looks at it is an
> obvious fix. A less obvious fix might involve a process which allows
> developers to work on a fork which is periodically merged, rather like
> the kernel.
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
When I originally read this I had some ideas in mind for a response
regarding review latency with the vmware driver patches, but felt like
anything I said, albeit what I consider honest, would sound
bad/offensive in some way, and didn't want to convey that.
But this came up in IRC today:
That spurred some discussion around this same topic and I think
highlights one of the major issues, which is code quality and the design
of the driver.
For example, the driver's spawn method is huge and there are a lot of
nested methods within it. There are a lot of vmware patches and a lot
of blueprints, and a lot of them touch spawn. When I'm reviewing them,
I'm looking for new conditions and checking to see if those are unit
tested (positive/error cases) and a lot of the time I find it very
difficult to tell if they are or not. I think a lot of that has to do
with how the vmware tests are scaffolded with fakes to simulate a
vcenter backend, but it makes it very difficult if you're not extremely
familiar with that code to know if something is covered or not.
And I've actually asked in bp reviews before, 'you have this new/changed
condition, where is it tested' and the response is more or less 'we plan
on refactoring this later and will add unit tests then' or 'it's hard to
test this path without a lot of changes to how the tests are working'.
That's unacceptable to me, and I generally give up on the review after that.
So to move this all forward, I think that bp above should be top
priority for the vmware team in Juno to keep bp patches moving at the
pace they do, because the features and refactoring just keeps coming and
at least for me it's very hard to burn out on looking at those reviews.
More information about the OpenStack-dev