[openstack-dev] [tripleo] Idempotence of the deployment process
Alex Schultz
aschultz at redhat.com
Fri Mar 31 23:21:12 UTC 2017
Hey folks,
I wanted to raise awareness of the concept of idempotence[0] and how
it affects deployments. In the puppet world, we consider this very
important because puppet is all about ensuring a desired state
(i.e. a system with config files + services). That being said, I feel
that it is important for any deployment tool to be aware of this.
When the same code is applied to the system repeatedly (as would be
the case in a puppet master deployment) the subsequent runs should
result in no changes if there is no need. If you take a configured
system and rerun the same deployment code you don't want your services
restarting when the end state is supposed to be the same. In the case
of TripleO, we should be able to deploy an overcloud, and rerunning the
deployment process should result in no configuration changes and 0
services being restarted during the process. The second run should
essentially be a noop.
We have recently uncovered various bugs[1][2][3][4] that have
introduced service disruption due to a lack of idempotency causing
service restarts. So when reviewing or developing new code, the
important thing is to think about what happens if this bit of code is
run twice. There are a few common items that come up
around idempotency. Things like execs in puppet-tripleo should be
refreshonly or use unless/onlyif to prevent running again if
unnecessary. Additionally in the TripleO configuration it's important
to understand in which step a service is configured and if it possibly
would get deconfigured in another step. For example, we configure
apache and some wsgi services in step 3. But we currently configure
some additional wsgi openstack services in step 4 which is resulting
in excessive httpd restarts and possible service unavailability[5]
when updates are applied.
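
To illustrate the "guard before acting" idea behind refreshonly and unless/onlyif, here is a minimal Python sketch. It is not puppet-tripleo code; ensure_line, apply_config, and the "demo-service" name are hypothetical and only show the pattern: check the current state first, and trigger a restart only when something actually changed.

```python
import os
import tempfile


def ensure_line(path, line):
    """Append `line` to `path` only if it is missing.

    Returns True when the file was changed, False when the desired
    state already held (similar in spirit to an `unless` guard).
    """
    existing = []
    if os.path.exists(path):
        with open(path) as f:
            existing = f.read().splitlines()
    if line in existing:
        return False  # nothing to do, so nothing should be restarted
    with open(path, "a") as f:
        f.write(line + "\n")
    return True


def apply_config(path, line, restarts):
    """Record a restart only when the config changed (refreshonly-like)."""
    if ensure_line(path, line):
        restarts.append("demo-service")  # hypothetical service name


if __name__ == "__main__":
    conf = os.path.join(tempfile.mkdtemp(), "demo.conf")
    restarts = []
    apply_config(conf, "workers=4", restarts)  # first run: change + restart
    apply_config(conf, "workers=4", restarts)  # second run: noop
    print(restarts)
```

Without the guard in ensure_line, the second run would append the line again and restart the service a second time, which is exactly the kind of disruption described above.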
Another important place to understand this concept is in upgrades
where we currently allow for ansible tasks to be used. These should
result in an idempotent action when puppet is subsequently run which
means that the two bits of code essentially need to result in the same
configuration. For example in the nova-api upgrades for Newton to
Ocata we needed to run the same commands[6] that would later be run by
puppet to prevent clashing configurations and possible idempotency
problems.
Idempotency issues can cause service disruptions, longer deployment
times for end users, or even possible misconfigurations. I think it
might be beneficial to add an idempotency periodic job that is
basically a double run of the deployment process to ensure no service
or configuration changes on the second run. Thoughts? Ideally one in
the gate would be awesome but I think it would take too long to be
feasible with all the other jobs we currently run.
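
As a rough sketch of what such a double-run check could assert, assuming a deployment step that reports what it changed: the Python below models the system as a plain dict, and converge and double_run are hypothetical names, not part of any TripleO tooling. A real job would compare whatever change and restart reports the actual deployment produces.

```python
def converge(system, desired):
    """Bring `system` (a dict) to `desired`; return the keys that changed."""
    changed = []
    for key, value in desired.items():
        if system.get(key) != value:
            system[key] = value
            changed.append(key)
    return changed


def double_run(apply, system, desired):
    """Apply the same desired state twice and return both change lists."""
    first = apply(system, desired)
    second = apply(system, desired)
    return first, second


if __name__ == "__main__":
    system = {}
    desired = {"httpd_port": 80, "nova_workers": 4}
    first, second = double_run(converge, system, desired)
    # An idempotent deployment reports no changes on the second run.
    if second:
        raise SystemExit("not idempotent, second run changed: %s" % second)
    print("first run changed:", first)
```

The check only passes if the second change list is empty, which is the "second run is a noop" property the job would be enforcing.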
Thanks,
-Alex
[0] http://binford2k.com/content/2015/10/idempotence-not-just-big-scary-word
[1] https://bugs.launchpad.net/tripleo/+bug/1664650
[2] https://bugs.launchpad.net/puppet-nova/+bug/1665443
[3] https://bugs.launchpad.net/tripleo/+bug/1665405
[4] https://bugs.launchpad.net/tripleo/+bug/1665426
[5] https://review.openstack.org/#/c/434016/
[6] https://review.openstack.org/#/c/405241/