[openstack-dev] [nova] minimum review period for functional changes that break backwards compatibility

Robert Collins robertc at robertcollins.net
Sun Dec 29 06:50:25 UTC 2013

On 29 December 2013 04:21, Day, Phil <philip.day at hp.com> wrote:
> Hi Folks,
> I know it may seem odd to be arguing for slowing down a part of the review
> process, but I’d like to float the idea that there should be a minimum
> review period for patches that change existing functionality in a way that
> isn’t backwards compatible.

What is the minimum review period intended to accomplish? I mean:
everyone that reviewed this *knew* it changed a default, and that
guest OS's that did support ext3 but don't support ext4 would be
broken. Would you like to have seen a different judgement call - e.g.
'Because this is a backward breaking change, it has to go through one
release of deprecation warning, and *then* can be made' ?

One possible reason to want a different judgment call is that the
logic about impacted OS's was wrong - I claimed (correctly) that every
OS has support for ext4, but neglected to consider the 13 year
lifespan of RHEL...
https://access.redhat.com/site/support/policy/updates/errata/ shows
that RHEL 3 and 4 are both still supported, and neither support ext4.
So folk that are running apps in those legacy environments indeed
cannot move.

Another possible reason is that we should have a strict
no-exceptions-by-default approach to backwards incompatible changes,
even when there are config settings to override them. Whatever the nub
is - lets surface that and target it.

Basically, I'm not sure what problem you're trying to solve - lets
tease that out, and then talk about how to solve it. "Backwards
incompatible change landed" might be the problem - but since every
reviewer knew it, having a longer review period is clearly not
connected to solving the problem :).

> The specific change that got me thinking about this is
> https://review.openstack.org/#/c/63209/ which changes the default fs type
> from ext3 to ext4.    I agree with the comments in the commit message that
> ext4 is a much better filesystem, and it probably does make sense to move to
> that as the new default at some point, however there are some old OS’s that
> may still be in use that don’t support ext4.  By making this change to the

Per above, these seem to be solely RHEL3 and RHEL4.

> default without any significant notification period this change has the
> potential to brake existing images and snapshots.  It was already possible
> to use ext4 via existing configuration values, so there was no urgency to
> this change (and no urgency implied in the commit messages, which is neither
> a bug or blueprint).

Indeed - the reason for putting up the change was the positive
reception on the list. If the change was requested to wait, we would
have ensured there was a bug (because running non-default for no good
reason is a bug in TripleO+the component whose default is wrong) used
the config setting, and moved on.

> I’m not trying to pick out the folks involved in this change in particular,

I don't feel picked out :).

> it just happened to serve as a good and convenient example of something that
> I think we need to be more aware of and think about having some specific
> policy around.  On the plus side the reviewers did say they would wait 24
> hours to see if anyone objected, and the actual review went over 4 days –
> but I’d suggest that is still far too quick even in a non-holiday period for
> something which is low priority (the functionality could already be achieved
> via existing configuration options) and which is a change in default
> behaviour.  (In the period around a major holiday there probable needs to be
> an even longer wait).     I know there are those that don’t want to see
> blueprints for every minor functional change to the system, but maybe this
> is a case where a blueprint being proposed and reviewed may have caught the
> impact of the change.    With a number of people now using a continual

I'm extremely skeptical of 'wait longer' and 'use blueprints' as tools
to get 'big impact noticed': blueprints will get the PTL to see the
change, but not all reviewers. Waiting longer likewise: One could wait
3 weeks and not get reviewed. If the existing system isn't working,
doing more of it will just not work more :).

> deployment approach any change in default behaviour needs to be considered
> not just  for the benefits it brings but what it might break.  The advantage
> we have as a community is that there are lot of different perspectives that
> can be brought to bear on the impact of functional changes, but we equally
> have to make sure there is sufficient time for those perspectives to emerge.

Sure. What does this break though? Specifically, from this mail and
the research it's prompted me to do I can see that RHEL3 and RHEL4
users would stop having ephemeral work *by default*. They can still
work by requesting an ext3 drive from their cloud provider.

> Somehow it feels that we’re getting the priorities on reviews wrong when a
> low priority changes like this which can  go through in a matter of days,
> when there are bug fixes such as https://review.openstack.org/#/c/57708/
> which have been sitting for over a month with a number of +1’s which don’t
> seem to be making any progress.

We could talk about a strict FIFO approach to reviews, or a
drive-WIP-to-zero approach, but I think that such a discussion is best
separated out from the changes-like-this discussion, because there
will also be backwards incompatible changes which *are* urgent or
whatever, and so these are clearly different things to ensure our
systems cope with. IMNSHO.


Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud

More information about the OpenStack-dev mailing list