Open Stack

Mon Jan 25 22:11:42 UTC 2016

On Mon, Jan 25, 2016 at 03:08:07PM -0500, James Slagle wrote:
>    On Mon, Jan 25, 2016 at 1:09 PM, Dan Prince <dprince at redhat.com> wrote:
> 
>      As for tripleo-heat-templates... sure it hasn't been an entirely smooth
>      ride. I think a lot of the pain has to do with a lack of CI coverage on
>      proper upgrade paths upstream. I think this has less to do with the
>      architecture (certainly not the fact that it uses YAML) and more to do
>      with the fact that we may have cut corners, jammed a bunch of features
>      in really fast, and don't have CI to cover proper upgrades jobs.
> 
>    I think it has a lot less to do with that than it may appear on the
>    surface. Even if we had 100% functional test coverage of upgrade paths,
>    that does nothing to solve for scenarios where the templates have been so
>    heavily customized that stack updates can not be guaranteed to work, or
>    worse, destroy the cloud.
>
>    The most we could do in those scenarios is better detection around what
>    stack changes are going to be done by an update before anything is
>    actually applied. And if something is going to be destructive or is not
>    expected, we refuse to proceed. That would definitely help us as
>    developers to know if our own changes are breaking upgrades.

FYI I'm working on getting these patches landed, then we can wire exactly
this preview capability into TripleO:

https://review.openstack.org/#/c/268997/
https://review.openstack.org/#/c/269176/

I think there are some other easy-wins, such as introducing a
userdata_update_policy property to OS::Nova::Server, which would mean you
aren't forced to rebuild an entire ResourceGroup just because user-data for
new instances changes.  Similarly, introduce an "ignore" option for
existing flavor/image update policy properties.

But, yeah, we need to get way better at testing and detecting these sorts
of consequences, and loudly warning the operator before they happen.
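
To make the "refuse to proceed" idea concrete, here is a minimal sketch of
such a guard. It assumes the update preview can be reduced to a mapping of
change kinds (e.g. "replaced", "deleted") to resource names; those key names,
the function and the exception are hypothetical illustrations, not the
interface provided by the patches above.

```python
# Hypothetical sketch: the input shape and all names below are assumptions,
# not the confirmed output of any Heat or TripleO interface.
DESTRUCTIVE_KINDS = ("replaced", "deleted")


class DestructiveUpdateError(Exception):
    """Raised when a previewed update would replace or delete resources."""


def check_preview(resource_changes, allow=()):
    """Return True if the previewed update is safe to apply.

    resource_changes maps a change kind to a list of resource names.
    Names listed in `allow` are changes the operator explicitly accepted.
    """
    offending = []
    for kind in DESTRUCTIVE_KINDS:
        for name in resource_changes.get(kind, []):
            if name not in allow:
                offending.append((kind, name))
    if offending:
        details = ", ".join(f"{name} would be {kind}" for kind, name in offending)
        # Fail loudly before anything is actually applied.
        raise DestructiveUpdateError(f"refusing to update: {details}")
    return True
```

An operator who really does intend a replacement would pass the resource name
in `allow`, so the destructive change is an explicit decision rather than a
side effect.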

>    But, that doesn't really help the user get their cloud upgraded if they've
>    customized templates themselves, other than they know that they must undo
>    whatever custom changes they made, and if they made those for valid
>    reasons and can't undo, then they are stuck.
> 
>    They've taken what may appear as a "feature" of the system -- customizing
>    templates to add support for new things -- and turned it into an
>    anti-feature, now they can't upgrade without writing upgrade support for
>    whatever they added themselves. Perhaps having to undo new features we as
>    TripleO developers added for users in the first place.

FWIW I think you're overstating things here, we've got a well defined and
documented [1] set of ExtraConfig interfaces which should be safe to use
over upgrades (we need better CI testing of them tho).

If folks start hacking on and relying on internal implementation details of
any other project, they would expect upgrade pain, and TripleO is no
different - if you start hacking on undocumented internal interfaces,
you'll likely have more work to do come upgrade time.  Try forking a bunch
of puppet modules, same problem.

If anything we need to evolve this into an even more strongly defined
"plugin" type interface, where explicit steps are supported and folks can
interface their stuff (Dan's composable roles prototype[2] already makes
progress towards this btw).

Cheers,

Steve

[1] "Node customization and Third-Party Integration"
http://docs.openstack.org/developer/tripleo-docs/advanced_deployment/extra_config.html

[2] https://review.openstack.org/#/c/236243/11/puppet/roles/README.rst

[openstack-dev] [TripleO] Should we have a TripleO API, or simply use Mistral?
