[openstack-dev] [nova] minimum review period for functional changes that break backwards compatibility

Sean Dague sean at dague.net
Thu Jan 2 00:37:44 UTC 2014


On 01/01/2014 02:16 PM, Day, Phil wrote:
<snip>
>> So you are really making 2 assumptions here that aren't valid:
>>    * it was known to be a backwards compatibility problem - because it wasn't,
>> and no one provided feedback over the course of 4 days to indicate that it
>> was (there were some alternative implementation ideas, but no one said
>> "hold on, this could break X")
>
> That's not really an assumption - I'm saying it would have been spotted had there been more reviews, because I know I would have spotted it, as I think would some of my colleagues. We've been looking at a number of problems with backing files recently, including the security fix we posted a week or so ago, and have been working on how to configure some of our images to use ext4 - so it was an area that was fresh in our minds.
>
>>    * it wasn't urgent - and like I said the tone really was towards this being an
>> important need for TripleO
>>
>
> I'm assuming (or maybe asserting would be a better term) that it wasn't urgent because:
>
> - Nothing in the commit message suggests it's urgent - it describes it more as a major performance enhancement
>
> - It's not linked to a high priority bug or approved blueprint
>
> - The capability to use ext4 already existed via configuration options.  If this was needed for the TripleO gate tests (and again no mention of this in the commit itself), then why couldn't the tests have been configured to use ext4 rather than changing the default?
>
> So no, sorry - I'm not willing yet to back away from my assumption that this wasn't an urgently needed fix, especially because of the last point.
>
> It may have been the tone in IRC, but should I really be expected to look there to understand the background to a submitted change?
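
(For reference, the configuration route described above would presumably amount to setting the ephemeral filesystem format in nova.conf - I believe the relevant option is default_ephemeral_format, but treat the exact name as an assumption rather than something confirmed in this thread:)

```ini
# nova.conf fragment - a sketch only; option name assumed, not verified
# against the Nova release under discussion
[DEFAULT]
# Format newly created ephemeral disks as ext4 instead of the default
default_ephemeral_format = ext4
```

With something like this set in the gate job's config, the tests could exercise ext4 without changing the default for everyone.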

Again, I think in this case we did make a mistake, but I also think we'd 
have made the same mistake given the same data in the future. So I don't 
think review policy changes are the fix here.

Again, mostly opinion there. Backed up with a few footnotes in the 
response to Tim's post on this thread.

>> Which is why I don't think saying 3 week minimum review time is helpful,
>> because if any of the reviewers had thought it was backwards incompatible,
>> they would have -1ed it. Humans only get so much of that right.
>>
>
> I think I agree that a time limit is the wrong way to go - as in this case, the number of active reviewers is very variable throughout the year. Something based more on the number and type of reviewers (and maybe we need a distinction now that goes beyond just core and non-core) would probably be more workable.
>
>
>> So we need to handle this defensively and beef up the automated systems.
>> Does that require more people adding in defensive tests to those scenarios?
>> definitely. The fact that right now basically Dean and I are the only ones that
>> understand our upgrade testing (with a few others
>> bootstrapping) is way too few people touching that problem.
>>
>> If seamless upgrade is the most important thing to people, we need more
>> people in the trenches on the tooling that helps us verify that.
>>
>> 	-Sean
> As I said before, I'm not arguing against improving the automated tests, but of course we also have to be cognisant that extending the time taken to run tests is in itself a problem. And however much effort we can divert into adding tests, it will take time before that gets us improved coverage of this kind of guest / upgrade issue.

Gate time is not a problem, we can run this matrix in parallel.

My complaint here is that "however much effort" is basically zero right 
now. I know, because I'm reviewing and trying to help land what little 
effort is there.

I'd probably be a little more positive on looking at non code based 
solutions here if there was significant effort on the code based 
solutions, and we still had gaps. The fact that there isn't, makes me 
pretty negative on policy solutions.

> In the meantime, as you and Rob have pointed out, the review process clearly isn't working as well as we'd all like it to, with some changes getting no reviews and others sitting for weeks with many +1's (for example, here's a bug fix with seven +1's that has been stuck since 6th Dec: https://review.openstack.org/#/c/56258/ ) - none of which seems to be really correlated with the relative priority of the proposed changes. So I'm suggesting that we look at what can be done to improve the review process now, while the automated testing is improved.

Has it been brought up in the Nova meeting? That's usually a good place 
to bring up bug fixes that fall through the cracks.

	-Sean

-- 
Sean Dague
Samsung Research America
sean at dague.net / sean.dague at samsung.com
http://dague.net
