[openstack-dev] [tripleo] Contributing to TripleO is challenging

Adam Young ayoung at redhat.com
Sat Mar 5 02:03:10 UTC 2016

On 03/04/2016 09:23 AM, Emilien Macchi wrote:
> That's not the name of any Summit's talk, it's just an e-mail I wanted
> to write for a long time.
> It is an attempt to expose facts or things I've heard a lot, and to
> bring constructive thoughts about why it's challenging to contribute
> to the TripleO project.
> 1/ "I don't review this patch, we don't have CI coverage."
> One thing I've noticed in TripleO is that very few people are involved
> in CI work.
> In my opinion, the CI system is more critical than any feature in a
> product.
> Developing software without tests is a bit like http://goo.gl/OlgFRc
> All people - especially cores - in the project should be involved in CI
> work. If you are TripleO core and you don't contribute to CI, you might
> ask yourself why.

OK...so what is the state of TripleO CI?  My experience with TripleO has
shown that it is quite resource intensive, far more so than, say,
Keystone, and so I could see that being the gating factor.

In order for me to be able to get into TripleO coding, I needed a new
machine, with 32 GB of RAM, separate from my everyday work machine.  Not
a killer outlay, but enough to hold me up until I got the HW allocated.

If we could split up the testing of undercloud vs. overcloud, it might be
more feasible.  I see no fundamental reason that the majority of the
Overcloud development and testing could not be done on top of a
non-Ironic-based OpenStack deployment.

That leaves just the undercloud, which could, possibly, also run on
top of an existing OpenStack deployment for much of the development.

A true end-to-end run of TripleO with HA requires a lot:  3 physical
machines plus a little overhead for the Overcloud.  But this is what is
really needed, ideally on multiple vendors' systems, so that we
catch at least some of the hardware variation.

> 2/ "I don't review this patch, CI is broken."
> Another thing I've noticed in TripleO is that when CI is broken, again,
> very few people are actually working on fixing failures.
> My experience over the last years taught me to stop my daily work when
> CI is broken and fix it asap.

Puppet and Heat are black boxes to me still.  I don't clearly understand 
how they fit together.

I think we need to start depuppetifying TripleO. I know we have a lot of
sunk costs in it, but we went with Puppet because it was all we had,
not because it matched the problem set well.

I'd recommend a freeze on all new Puppet development, and doing all new
features in Ansible. Fully acknowledging the havoc this will wreak, I
think it is important strategically.  It is really hard to swap between
two languages, and the rest of OpenStack is in Python.  Switching to
Ruby is hard.

All of our Client support is in Python.

The number of people that know Puppet that actively contribute to 
OpenStack is small. The number of real Ruby experts is smaller.

> 3/ "I don't review it, because this feature / code is not my area".
> My first thought is "Aren't we supposed to be engineers and learn new areas?"
> My second thought is that I think we have a problem with TripleO Heat
> Templates.
> The code in THT (TripleO Heat Templates) is 80% Puppet / Hiera. If
> TripleO cores say "I'm not familiar with Puppet", we have a problem
> here, don't we?
> Maybe we should split this repository? Or revisit the list of people who
> can +2 patches on THT.

I am more than happy to review anything Keystone-related, but again, I
struggle with Puppet.

Not really knowing Heat either makes it even tougher. We need a better
overall orientation guide if people are going to come up to speed quicker.

> 4/ Patches are stalled. Most of the time.
> Over the last 12 months, I've pushed a lot of patches in TripleO and one
> thing I've noticed is that if I don't ping people, my patch gets no
> review. And I have to rebase it, every week, because the interface
> changed. I got +2, cool! Oh, merge conflict. Rebasing. Waiting for +2
> again... and so on.

The same is true on Keystone.  There is just a lot to get done on this
project.  All these projects.

> I personally spend 20% of my time reviewing code, every day.
> I wrote a blog post about how I do reviews with Gertty:
> http://my1.fr/blog/reviewing-puppet-openstack-patches/
> I suggest TripleO folks spend more time on reviews, for a few reasons:

Nice of you to write that up.

> * decreasing frustration from contributors
> * accelerating the development process
> * teaching new contributors to work on TripleO, and eventually scaling up
> the core team. It's a time investment, but worth it.
> In the Puppet team, we have weekly triage sessions and it's pretty helpful.
> 5/ Most of the tests are run... manually.
> How many times have I heard "I've tested this patch locally, and it
> does not work so -1"?
> The only test we do in current CI is a ping to an instance. Seriously?
> Most OpenStack CIs (Fuel included) run Tempest, for testing APIs and
> real scenarios. And we run a ping.
> That's similar to 1/ but I wanted to raise it too.

Again, testing is expensive; if I am testing a patch, my one and only
development system can't be used for development.  If we can find a way
to make things lighter, we can get more done.

> If we don't change the way we work on TripleO, people will become more
> frustrated and reduce contributions at some point.
> I hope from here we can have an open and constructive discussion to try
> to improve the TripleO project.
> Thank you for reading so far.