[openstack-dev] [TripleO] Forming our plans around Ansible

David Moreau Simard dms at redhat.com
Fri Jul 7 21:31:10 UTC 2017


On Fri, Jul 7, 2017 at 1:50 PM, James Slagle <james.slagle at gmail.com> wrote:
> (0) tripleo-quickstart which follows the common and well accepted
> approach to bundling a set of Ansible playbooks/roles.

I don't want to derail the thread but I really want to bring some
attention to a pattern that tripleo-quickstart has been using across
its playbooks and roles.
I sincerely hope that we can find a better implementation should we
start developing new things from scratch.

I'll sound like a broken record for those that have heard me mention
this before but for those that haven't, here's a concrete example of
how things are done today:
(Sorry for the link overload, making sure the relevant information is available)

For an example tripleo-quickstart job, here's the console [1] and its
corresponding ARA report [2]:
- A bash script is created [3][4][5] from a jinja template [6]
- A task executes the bash script [7][8][9]

My understanding is that things are done this way in order to provide
automated documentation and make the builds reproducible.

One of Ansible's greatest strengths is supposed to be its simplicity:
making things readable and straightforward ("Automation for Everyone"
is its motto).
It's hard for me to put succinctly into words how complicated and
counter-intuitive the current pattern is making things so I'll provide
some examples.

1) When a task running a bash script fails, you don't know what failed
from the ansible-playbook output.
    You need to find the appropriate log file and look at the output
of the bash script there.

2) The templated bash scripts contain logic, conditionals and
variables, making it non-trivial to guess what the script actually
ends up looking like once it is "compiled".
    If you happen to know that this task actually ran a templated bash
script in the first place, you need to know or remember where it is
located in the logs after the job is complete and then open it up.

3) There can be more than one operation inside a bash script so you
don't know which of those operations failed unless you look at the
logs.
    This reduces granularity which makes it harder to profile,
identify and troubleshoot errors.

4) You don't know what the bash script actually did (if it did
anything at all) unless you look at the logs.

5) Idempotency is handled (or not) inside the bash scripts, so Ansible
has no way of knowing whether running the bash script changed
something or not.
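
To make the contrast concrete, here is a rough sketch of the two
approaches side by side. Only the template name comes from [6]; the
working_dir variable, the log redirection, the repository and the
package are hypothetical values I made up for illustration, not what
the repo-setup role actually does.

```yaml
# Current pattern: render a script, then run it as one opaque task.
# Ansible only sees "the script exited 0 or not".
- name: Create repo setup script
  template:
    src: repo_setup.sh.j2
    dest: "{{ working_dir }}/repo_setup.sh"   # working_dir: assumed variable
    mode: "0755"

- name: Run repo setup script
  shell: "{{ working_dir }}/repo_setup.sh > {{ working_dir }}/repo_setup.log 2>&1"

# Alternative: one module call per operation. Each task reports
# ok/changed/failed on its own, and the modules handle idempotency.
- name: Configure an example repository (hypothetical values)
  yum_repository:
    name: example-repo
    description: Example repository
    baseurl: http://mirror.example.org/centos/7/
    gpgcheck: false
  become: true

- name: Install a package from that repository (hypothetical package)
  package:
    name: example-package
    state: present
  become: true
```

With the second form, a failure points at a named task in the
ansible-playbook output and in ARA, and a second run shows "ok"
instead of re-running everything blindly.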

Here's an example ARA report from openstack-ansible where you're
easily able to tell what went wrong and what happened [10].

Now, I'm not being selfish and trying to say that things should be
written in a specific way so that it can make ARA more useful.
Yes, ARA would be more useful. But this is about following Ansible
best practices and making it more intuitive to understand how things
work and what happens when tasks run.
Puppet is designed the same way: there are resources and modules to do
things. You don't template bash scripts and then use Exec resources.

Documentation and reproducible builds are great things to have, but
not with this kind of tradeoff IMO.
Surely there are other means of providing documentation and reproducible builds.

TripleO is complicated enough already.
Actively making it simpler in every way we can, not just for
developers but for users and operators, should be a priority and a
theme throughout the refactor around Ansible.
We should agree on the best practices and use them.

[1]: http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/console.html
[2]: http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/logs/ara_oooq/reports/d8f79fa8-c8db-4134-8696-795d04ba6f65.html
[3]: http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/console.html#_2017-07-07_15_11_38_778824
[4]: http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/logs/ara_oooq/file/efa7400f-9f8a-4b02-b650-2060c7a3cec3/#line-1
[5]: http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/logs/ara_oooq/result/4b3cffd6-f252-4156-9f15-bceed6f12510/
[6]: https://github.com/openstack/tripleo-quickstart/blob/ec7b2d71f28efd301eafec8f53fc644c2fd8cc6e/roles/repo-setup/templates/repo_setup.sh.j2
[7]: http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/console.html#_2017-07-07_15_11_42_330477
[8]: http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/logs/ara_oooq/file/666608c9-fada-49cb-b72c-9b93f8d2565b/#line-1
[9]: http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/logs/ara_oooq/result/49de1e7a-32f9-4e6e-9242-fb8afdb91d88/
[10]: http://logs.openstack.org/99/477599/2/check/gate-openstack-ansible-openstack-ansible-ceph-centos-7-nv/c1efa30/logs/ara/

David Moreau Simard
Senior Software Engineer | Openstack RDO

dmsimard = [irc, github, twitter]


