[openstack-dev] [tripleo] Contributing to TripleO is challenging
shardy at redhat.com
Mon Mar 21 10:57:21 UTC 2016
On Fri, Mar 18, 2016 at 01:27:33PM +0000, Arkady_Kanevsky at DELL.com wrote:
> Agree on the rant. But not clear on concrete proposal to fix it.
> Spend more time “fixing” CI and use Tempest as a gate is a bit vague.
> Unless we test known working version of each project in TripleO CI you are
> dependent on health of other components.
I've so far resisted replying to this thread, because while valid, many of
the concerns expressed by Emilien are quite general complaints, and it's
hard to reply with specific solutions.
However, work *is* going on to address many of these problems. Let's see if
I can provide a summary, to clarify the various "concrete proposals":
1. Core team & review velocity
We've had a small and very overloaded core team for a while now, and this
will be helped by expanding our community to include those who've been
regularly contributing excellent work and reviews as core reviewers:
Note that I personally think it's absolutely fine for folks to be more
expert in some subsystem and to focus extra review attention on e.g. API,
UI, Puppet or whatever. This subsystem-core model has been well proven in
other projects, and folks will naturally broaden their areas of deeper
knowledge over time.
Related to this is movement of code, such as the puppet-tripleo refactoring
mentioned by Michael - this has already started, and will help with
providing a cleaner interface between the puppet and heat pieces (which
will also help focus reviewer attention appropriately).
2. Day 1 developer experience
This is closely related to the CI failure rate - there are efforts to
integrate with the RDO tripleo-quickstart tooling, which simplifies the
initial undercloud setup, and potentially makes consuming pre-built,
validated undercloud images (probably output artefacts from our new
periodic CI job) much easier.
So, this will mean that both developers and CI can potentially be less
regularly impacted by trunk regressions which often cause CI to fail, and
break developer environments.
3. CI coverage and trunk failure rate
We've been working really hard to improve things here. There are really
several inter-related issues:
- Lack of hardware capacity in the tripleo CI cloud
- Frequent trunk regressions breaking our CI
- Lack of coverage of some key features (network isolation, SSL, IPv6, upgrades)
- Lack of coverage for vendor plugin templates/puppet code
There's work ongoing to improve this from multiple perspectives:
New periodic CI job (to be used for automated promotion of the
current-tripleo repo, and for pre-built undercloud images):
Add network isolation support to CI:
Test SSL enabled in overcloud:
CI coverage of IPv6:
Discussion around better documented integration for third-party CI:
In summary, we're doing a ton of work as a community to address the
concerns raised by Emilien, and we've still got a lot more to do, but there
*is* clear agreement on many of the problems, and a concrete plan in most
cases to resolve them.