[openstack-dev] Continuous deployment - significant process change
Clint Byrum
clint at fewbar.com
Tue Apr 30 19:55:32 UTC 2013
On 2013-04-29 18:18, Robert Collins wrote:
> On 30 April 2013 10:48, Russell Bryant <rbryant at redhat.com> wrote:
>
>>>
>>> http://rc3.org/2013/03/03/dont-get-stuck/
>>
>> I think you can develop a feature in its entirety with a series of
>> reasonably sized changes with a well designed patch set.
>>
>> Once code gets in to the tree, it becomes a burden on the project to
>> maintain that code, even if disabled by default. If your feature
>> isn't
>> done, I don't want it in the tree.
>
> Let's separate out some concerns here.
>
> - pushing crap code, or code prematurely
> - waiting for a feature to be 'finished' before landing it.
>
> The former is a scary risk, and not something I want to see.
>
> The latter is - I put it to you - impossible. A -big chunk- of the
> reason for a freeze is because we recognize that features that are
> landing today are *not done*. They are *not finished*. The model of
> 'build a chunk of work and then when you're really happy land it' has
> a tonne of friction - moving dependencies, interactions with other
> changes, and the difficulties of production hardening - that make it a
> dream, not the actual reality.
>
> Let's look at what is being proposed in detail, as a delta against our
> current published standards:
> - A single conceptual change would be allowed to land if it passes
> review, *even if it has no callers in the code base yet*.
> - Features that are building up towards completeness would have their
> code disabled by default, to prevent accidental impact on users.
> - As a consequence most new features would be able to be easily
> *disabled* upon release, in the event of issues.
>
> There is a related discussion, about the interaction between freezes
> and cramming, but I know that folk that haven't experienced the CD
> workflow and the qualitative change it brings will be very hesitant to
> embrace it, so I think we should leave that for the next cycle when
> the basics are in place.
>
I have to agree. What I have seen in most of the negative responses to
these changes is a lot of stable-release dogma mixed in with some valid
concerns. So I feel like we need to compare the two processes. Let's
look at two hypothetical changes: one to allow multiple hypervisors in
Nova, and another to add support for a new hypervisor.
Stable release process:
* Refactor for multi-hypervisor is begun in feature branch.
* New hypervisor is begun in separate feature branch.
* Devs of both review each other's patches, but neither is willing to
merge the other's work, because they don't want to block their own
development on landing a branch outside their control.
* Users are encouraged to test each branch. Both branches are tested
minimally, and never tested together.
* Reviews begin on both in earnest. The smaller, simpler, more isolated
change, the new hypervisor, lands.
* Now the multi-hypervisor work is forced to go back and rework the new
hypervisor to fit its refactoring. Users who tested the new hypervisor
are asked to re-test with multi-hypervisor.
* Feature freeze arrives, multi-hypervisor is kept out, or rushed in.
* Release happens either with an untested multi-hypervisor rushed in, or
without multi-hypervisor at all.
CD Process:
* Refactor for multi-hypervisor is begun by landing a small change, with
feature flags keeping the old code paths alive (sketched after this
list).
* New hypervisor is begun with skeleton implementation.
* The refactor includes the skeleton implementation as all drivers are
reworked.
* New hypervisor reaches a minimal working state and its developers
encourage users to try it.
* Multi-hypervisor also reaches a minimal working state and its
developers encourage users to try it. Users can now try both without
pulling special branches or merging any code, simply by flipping
configuration options.
* Incremental improvements go forward with both while any trunk users
who have turned either or both on provide feedback.
* Feature freeze arrives. Neither the developers nor the users who are
testing notice or care. The review queue is not bombed with giant
last-minute patches.
* Release happens with a more integrated, better-tested combination of
multi-hypervisor and the new hypervisor, and with the old behavior
intact for users affected by any regressions.
* The next release series opens; the new hypervisor is proposed to be
enabled by default and the old code paths removed.
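To make the feature-flag mechanism above concrete, here is a minimal
sketch of how a disabled-by-default code path could be guarded. The
option name (use_multi_hypervisor), the driver helpers, and the
oslo.config wiring are all hypothetical illustrations of the pattern,
not the actual Nova change:

    # Hypothetical sketch: guard an in-progress refactor behind a
    # disabled-by-default flag so trunk users keep the old behaviour.
    from oslo.config import cfg  # 2013-era import path for oslo.config

    opts = [
        cfg.BoolOpt('use_multi_hypervisor',
                    default=False,
                    help='Enable the experimental multi-hypervisor code '
                         'path. Off by default until the refactor is '
                         'complete.'),
    ]

    CONF = cfg.CONF
    CONF.register_opts(opts)


    def _load_single_hypervisor_driver():
        # Placeholder for the existing, production-hardened loader.
        return 'single-hypervisor driver'


    def _load_multi_hypervisor_driver():
        # Placeholder for the new loader being built up incrementally.
        return 'multi-hypervisor driver'


    def load_compute_driver():
        """Pick a code path based on the feature flag."""
        if CONF.use_multi_hypervisor:
            # New, still-incomplete path; only exercised by users who
            # explicitly opt in.
            return _load_multi_hypervisor_driver()
        # Old path stays the default and remains one config change away.
        return _load_single_hypervisor_driver()


    if __name__ == '__main__':
        CONF([])  # real services would parse config files here
        print(load_compute_driver())

A user who wants to test the new code path flips the flag in their
config (e.g. use_multi_hypervisor = True under [DEFAULT]) rather than
pulling a special branch, and shipping with the flag off is what makes
the feature trivially disable-able if it regresses.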
I understand that this is a straw man. I know that other things can
happen. But I see a lot of development happen under the stable release
process scenario above. I fail to see how having new code arrive
incrementally, in small batches and protected by feature flags, is
going to make anything worse. Those things should, however, make
continuous delivery of OpenStack much smoother.