[openstack-dev] [TripleO] os-refresh-config run frequency

jang at ioctl.org
Thu Jun 26 11:24:48 UTC 2014


On Thu, 26 Jun 2014, Macdonald-Wallace, Matthew wrote:

> Hi all,
> 
> I've been working more and more with TripleO recently and whilst it does 
> seem to solve a number of problems well, I have found a couple of 
> idiosyncrasies that I feel would be easy to address.
> 
> My primary concern lies in the fact that os-refresh-config does not run 
> on every boot/reboot of a system.  Surely a reboot *is* a configuration 
> change and therefore we should ensure that the box has come up in the 
> expected state with the correct config?

I'm in complete agreement with this. The odd split between OS-provided 
service management mechanisms and OS-provided ones* seems a bit bizarre to 
me.

I'd think it makes a great deal of sense for os-collect-config to consider 
a reboot a configuration/state change.

* Operating System, OpenStack, resp. (see what I did there?)
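
A minimal sketch of what "consider a reboot a state change" could look like,
assuming a Linux host where the kernel exposes a per-boot identifier at
/proc/sys/kernel/random/boot_id. The function name and the marker path are
hypothetical, and this is not how os-collect-config is implemented; it only
illustrates the idea.

```python
def booted_since_last_run(boot_id_path, marker_path):
    """Return True if the boot ID differs from the one recorded last time.

    boot_id_path: file holding the current boot ID (on Linux, typically
                  /proc/sys/kernel/random/boot_id).
    marker_path:  hypothetical state file where the last-seen ID is kept.
    """
    with open(boot_id_path) as f:
        current = f.read().strip()

    try:
        with open(marker_path) as f:
            previous = f.read().strip()
    except FileNotFoundError:
        # First run ever: treat it like a fresh boot.
        previous = None

    if current == previous:
        return False

    # Record the new boot ID so later runs in this boot return False.
    with open(marker_path, "w") as f:
        f.write(current)
    return True
```

A caller would trigger the refresh scripts whenever this returns True. A more
careful version would write the marker only after the refresh has succeeded.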


> My secondary concern is that through not running os-refresh-config on a 
> regular basis by default (i.e. every 15 minutes or something in the same 
> style as chef/cfengine/puppet), we leave ourselves exposed to someone 
> trying to make a "quick fix" to a production node and taking that node 
> offline the next time it reboots because the config was still left as 
> broken owing to a lack of updates to HEAT (I'm thinking a "quick change" 
> to allow root access via SSH during a major incident that is then left 
> unchanged for months because no-one updated HEAT).

This would be a larger body of work, and I don't want to tie the two halves 
of this suggestion together. The o-r-c scripts are written with the 
simplifying assumption that they don't get run very often, so they tend 
to be pretty brutal in their approach.

At a minimum, this would need some support machinery to detect changed 
config and let o-r-c scripts make restart decisions more gracefully. 
That could potentially be worked in as part of live upgrading.
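
To make the "detect changed config" part concrete, here is a rough sketch,
assuming a script that compares a digest of a rendered config file against
the digest it saw on its previous run. The names file_digest and
needs_restart, and the idea of a per-service state file, are hypothetical;
nothing like this is claimed to exist in os-refresh-config today.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def needs_restart(config_path, state_path):
    """Return True if config_path changed since the digest in state_path.

    state_path is a hypothetical file holding the digest from the last run.
    The stored digest is updated whenever a change is detected.
    """
    current = file_digest(config_path)

    try:
        with open(state_path) as f:
            previous = f.read().strip()
    except FileNotFoundError:
        # No recorded digest yet: assume the service must be (re)started.
        previous = None

    if current == previous:
        return False

    with open(state_path, "w") as f:
        f.write(current)
    return True
```

A script that restarts a service only when needs_restart() is true could be
run frequently without bouncing services on every pass.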


> I'm sure there are other solutions to these problems, however I know 
> from experience that claiming this is solved through "education of 
> users" or (more severely!) via HR is not a sensible approach to take as 
> by the time you realise that your configuration has been changed for the 
> last 24 hours it's often too late!

I'm sold on the need for some machinery to assert the expected 
configuration in an operating environment. I'd rather see some of the 
obvious pain points with the current o-r-c semantics dealt with first.


Is there actually any _harm_ in running os-refresh-config on every boot?


-- 
Update your address books: jang at ioctl.org  http://ioctl.org/jan/
"No generalised law is without exception." A self-demonstrating axiom.


