[openstack-dev] [tripleo] Managing no-mergepy template duplication

Clint Byrum clint at fewbar.com
Fri Dec 5 18:49:20 UTC 2014

Excerpts from Steven Hardy's message of 2014-12-04 01:09:18 -0800:
> On Wed, Dec 03, 2014 at 06:54:48PM -0800, Clint Byrum wrote:
> > Excerpts from Dan Prince's message of 2014-12-03 18:35:15 -0800:
> > > On Wed, 2014-12-03 at 10:11 +0000, Steven Hardy wrote:
> > > > Hi all,
> > > > 
> > > > Lately I've been spending more time looking at tripleo and doing some
> > > > reviews. I'm particularly interested in helping the no-mergepy and
> > > > subsequent puppet-software-config implementations mature (as well as
> > > > improving overcloud updates via heat).
> > > > 
> > > > Since Tomas's patch landed[1] to enable --no-mergepy in
> > > > tripleo-heat-templates, it's become apparent that frequently patches are
> > > > submitted which only update overcloud-source.yaml, so I've been trying to
> > > > catch these and ask for a corresponding change to e.g. controller.yaml.
> > > > 
> > > > This raises the following questions:
> > > > 
> > > > 1. Is it reasonable to -1 a patch and ask folks to update in both places?
> > > 
> > > Yes! In fact until we abandon merge.py we shouldn't land anything that
> > > doesn't make the change in both places. Probably more important to make
> > > sure things go into the new (no-mergepy) templates though.
> > > 
> > > > 2. How are we going to handle this duplication and divergence?
> > > 
> > > Move as quickly as possible to the new without-mergepy variants? That is
> > > my vote anyways.
> > > 
> > > > 3. What's the status of getting gating CI on the --no-mergepy templates?
> > > 
> > > Devtest already supports it by simply setting an option (which sets an
> > > ENV variable). Just need to update tripleo-ci to do that and then make
> > > the switch.
> > > 
> > > > 4. What barriers exist (now that I've implemented[2] the eliding functionality
> > > > requested[3] for ResourceGroup) to moving to the --no-mergepy
> > > > implementation by default?
> > > 
> > > None that I know of.
> > > 
> > 
> > I concur with Dan. Elide was the last reason not to use this.
> That's great news! :)
> > One thing to consider is that there is no actual upgrade path from
> > non-autoscaling-group based clouds, to auto-scaling-group based
> > templates. We should consider how we'll do that before making it the
> > default. So, I suggest we discuss possible upgrade paths and then move
> > forward with switching one of the CI jobs to using the new templates.
> This is probably going to be really hard :(
> The sort of pattern which might work is:
> 1. Abandon mergepy based stack
> 2. Have helper script to reformat abandon data into nomergepy based adopt
> data
> 3. Adopt stack
> Unfortunately there are several abandon/adopt bugs we'll have to fix if we
> decide this is the way to go (original author hasn't maintained it, but we
> can pick up the slack if it's on the critical path for TripleO).
> An alternative could be the external resource feature Angus is looking at:
> https://review.openstack.org/#/c/134848/
> This would be more limited (we just reference rather than manage the
> existing resources), but potentially safer.
> The main risk here is import (or subsequent update) operations becoming
> destructive and replacing things, but I guess to some extent this is a risk
> with any change to tripleo-heat-templates.

So you and I talked on IRC, but I want to socialize what we talked about.

The abandon/adopt pipeline is a bit broken in Heat and hasn't proven to be
as useful as I'd hoped when it was first specced out. It seems too broad,
and relies on any tools understanding how to morph a whole new format
(the abandon json).

With external_reference, the external upgrade process just needs to know
how to morph the template. So if we're combining 8 existing servers into
an autoscaling group, we just need to know how to make an autoscaling
group with 8 servers as the external reference ids. This is, I think,
the shortest path to a working solution, as I feel the external
reference work in Heat is relatively straightforward and the spec has
widespread agreement.
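
To make the "morph the template" step concrete, here is a rough Python
sketch of what an external upgrade tool might do. It is not Heat's API:
the function name, the resource names, and in particular the
`external_references` property are made up for illustration, and the
final external reference spec may well look different. It only shows
the shape of the transformation: N standalone server resources go in,
one group resource carrying their existing IDs comes out.

```python
import copy


def fold_servers_into_group(template, server_ids, group_name="servers",
                            group_type="OS::Heat::AutoScalingGroup"):
    """Return a copy of `template` with the listed servers folded into one group.

    `server_ids` maps existing resource names to the physical IDs of the
    servers that are already deployed.
    """
    if not server_ids:
        raise ValueError("need at least one existing server to fold")

    new_template = copy.deepcopy(template)
    resources = new_template["resources"]

    names = sorted(server_ids)
    # Remove the standalone server resources; the first definition is
    # reused as the member definition for the group.
    definitions = [resources.pop(name) for name in names]

    resources[group_name] = {
        "type": group_type,
        "properties": {
            "min_size": len(names),
            "max_size": len(names),
            "desired_capacity": len(names),
            "resource": definitions[0],
            # HYPOTHETICAL property: stands in for however the external
            # reference feature ends up attaching existing IDs to members.
            "external_references": [server_ids[name] for name in names],
        },
    }
    return new_template


old = {
    "resources": {
        "controller0": {"type": "OS::Nova::Server",
                        "properties": {"flavor": "baremetal"}},
        "controller1": {"type": "OS::Nova::Server",
                        "properties": {"flavor": "baremetal"}},
    }
}
new = fold_servers_into_group(
    old, {"controller0": "uuid-0", "controller1": "uuid-1"})
print(new["resources"]["servers"]["properties"]["external_references"])
```

The point is that the tool only has to understand templates, which we
already know how to read and write, rather than the abandon json.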

There was another approach I mentioned, which is that we can teach Heat
how to morph resources. So we could teach Heat that servers can be made
into autoscaling groups, and vice-versa. This is a whole new feature
though, and IMO, something that should be tackled _after_ we make it
work with the external_reference feature, as this is basically a
superset of what we'll do externally.

> Has any thought been given to upgrade CI testing?  I'm thinking grenade or
> grenade-style testing here where we test maintaining a deployed overcloud
> over an upgrade of (some subset of) changes.
> I know the upgrade testing thing will be hard, but to me it's a key
> requirement to mature heat-driven updates vs those driven by external
> tooling.

Upgrade testing is vital to the future of the project IMO. We really
haven't validated the image based update method upstream yet. In Helion,
we're using tripleo-ansible for updates, and that works great, but we
need to get that or something similar into the pipeline for the gate,
or every user who adopts will be left with a ton of work if they want
to do upgrades.

The approach we've used for testing in Helion is to deploy the new commit,
then generate a new image with a new file, and upgrade (thanks JP for your
amazing work on this btw, hopefully we can realize this upstream soon. :)

That is not ideal though. What we need to do is test upgrading from
the last commit to the new one, and arguably, also from the last stable
release to the new commit (ala grenade).
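
As a small sketch of the test matrix I have in mind (the function and
the ref names are hypothetical, this is not an existing tripleo-ci
helper): every new commit gets an upgrade run from its parent, plus one
from the last stable release when that is a different ref.

```python
def upgrade_test_pairs(new_commit, previous_commit, last_stable=None):
    """Return (deploy_from, upgrade_to) pairs to exercise for one change."""
    # Always test the incremental upgrade from the previous commit.
    pairs = [(previous_commit, new_commit)]
    # Optionally add the grenade-style upgrade from the last stable release.
    if last_stable is not None and last_stable != previous_commit:
        pairs.append((last_stable, new_commit))
    return pairs


for source, target in upgrade_test_pairs("new-sha", "parent-sha", "stable/juno"):
    print("deploy", source, "then upgrade to", target)
```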
