[openstack-dev] Continuous deployment - significant process change

Robert Collins robertc at robertcollins.net
Tue Apr 30 01:36:31 UTC 2013


On 30 April 2013 12:11, Michael Still <mikal at stillhq.com> wrote:
> On Tue, Apr 30, 2013 at 7:58 AM, Joe Gordon <jogo at cloudscaling.com> wrote:

>> This touches on a bigger issue, what is experimental and what is not.
>> Currently we don't do a good job of differentiating the two.  I think we
>> first need to flesh out what it means to be experimental for users and for
>> development.
>
>
>  I strongly disagree with landing features off by default. There are a few
> reasons for that:
>
>  - we've been moving away from "make it work" flags. This moves us in the
> other direction -- we proliferate flags, many of which might be required to
> make nova work well in real deployments. These extra flags also make the
> codebase more complicated, and therefore less safe to develop in.

No, we covered that in the session. Sadly it looks like it wasn't
captured in the Etherpad.
You've got two issues here:
 - more options in config files (and how do we deprecate them etc)
 - code complexity.

For the former, the options would *only* be needed for early adopters.
These are folk that would otherwise be running an integration branch
where they merge the patch series + trunk, and that's a pain because
it makes it harder to report bugs about the evolving feature. When the
feature is declared GA it would just be enabled by changing the
default value for the option, and once we're confident in the feature,
the option and any code conditionals would be removed.
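
As a rough sketch of that lifecycle (the option name and both functions
are made up for illustration, they are not taken from nova, and a real
implementation would go through the project's own config machinery):

```python
# Hypothetical illustration only: these names are not taken from nova.
# The option ships "off" while the feature is considered experimental.
DEFAULTS = {"experimental_new_scheduler": False}


def load_options(overrides=None):
    """Merge a deployer's overrides on top of the shipped defaults."""
    options = dict(DEFAULTS)
    options.update(overrides or {})
    return options


def pick_host(hosts, options):
    """Choose a host, taking the new code path only when opted in."""
    if options["experimental_new_scheduler"]:
        # New behaviour still under development: least-loaded host.
        return min(hosts, key=lambda h: h["load"])
    # Existing behaviour: first host in the list.
    return hosts[0]


hosts = [{"name": "a", "load": 5}, {"name": "b", "load": 1}]
default_choice = pick_host(hosts, load_options())
early_adopter_choice = pick_host(
    hosts, load_options({"experimental_new_scheduler": True}))
```

Declaring GA is then a one-line change to the default, and the later
cleanup deletes the option together with the old branch.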

Code complexity: Yes, the code in trunk would be made a little more
complex. However that complexity is short lived - it lives for the
duration that the feature is considered experimental, which is about
the length of time of development of the feature + however much real
world experience we want.
In terms of safety for development, it is no less safe than it is
today, it's just that the risk is clearer: right now having a
many-item patch-series outstanding places that risk on anyone using
the feature branch, and makes changes to trunk impossible to assess
for impact on that patch series.
The current approach is strictly worse because it forces rework for
everyone: the people that land conflicting changes and the people
working on the feature branch. Changes that could have been done once
properly are now done two or three times.

Nova is about 70K LOC at the moment excluding tests. If we landed 20
features per month each taking 2 months to build/land/move out of
experimental we'd have 40 options outstanding at any point in time,
and even if there were 10 occurrences of each conditional that would
only be 400 conditionals out of 8000, or 5%.
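
Spelling that arithmetic out, using only the figures above (the
variable names are just for illustration):

```python
features_landed_per_month = 20
months_experimental = 2       # time to build/land/move out of experimental
conditionals_per_option = 10  # deliberately pessimistic guess
total_conditionals = 8000     # figure quoted above

# Options that exist at any one time: landing rate times lifetime.
options_outstanding = features_landed_per_month * months_experimental
extra_conditionals = options_outstanding * conditionals_per_option
share = extra_conditionals / total_conditionals

print(options_outstanding, extra_conditionals, f"{share:.0%}")
```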

Personally, I think 5% increased complexity in exchange for far fewer
unpleasant surprises is a really good tradeoff.

>  - it results in untested code in the codebase (untested in the sense of
> undeployed by any real deployment). This means we end up with a false sense
> of security about the stability of that code until we eventually turn it on
> by default. Then we discover no one has used it before and that it's totally
> broken.

We already have that. In fact this approach would give us less of it,
because folk can build on the code much earlier.

> From my selfish upstream perspective, the purpose of CD is to help us find
> errors before they hit large numbers of deployments. That's why we should be
> so grateful to Rackspace, HP, or anyone else willing to run trunk -- that
> testing is invaluable and comes at a cost for the deployer. We should be
> doing everything we can to ensure brokenness is found early, even if that
> means slightly more pain for deployers.

Right, finding brokenness early is *what this is about*. Keeping
things out until they are perfect is a problem for Rackspace and HP.
If we're going to run trunk, we have to stop doing that.

-Rob
-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Cloud Services


