[openstack-dev] [TripleO] os-refresh-config run frequency

Michael Kerrin michael.kerrin at hp.com
Thu Jul 17 14:54:26 UTC 2014


On Thursday 26 June 2014 12:20:30 Clint Byrum wrote:
> Excerpts from Macdonald-Wallace, Matthew's message of 2014-06-26 04:13:31 -0700:
> > Hi all,
> > 
> > I've been working more and more with TripleO recently and whilst it does
> > seem to solve a number of problems well, I have found a couple of
> > idiosyncrasies that I feel would be easy to address.
> > 
> > My primary concern lies in the fact that os-refresh-config does not run on
> > every boot/reboot of a system.  Surely a reboot *is* a configuration
> > change and therefore we should ensure that the box has come up in the
> > expected state with the correct config?
> > 
> > This is easily fixed through the addition of an "@reboot" entry in
> > /etc/crontab to run o-r-c or (less easily) by re-designing o-r-c to run
> > as a service.
> > 
> > My secondary concern is that through not running os-refresh-config on a
> > regular basis by default (i.e. every 15 minutes or something in the same
> > style as chef/cfengine/puppet), we leave ourselves exposed to someone
> > trying to make a "quick fix" to a production node and taking that node
> > offline the next time it reboots because the config was still left as
> > broken owing to a lack of updates to HEAT (I'm thinking a "quick change"
> > to allow root access via SSH during a major incident that is then left
> > unchanged for months because no-one updated HEAT).
> > 
> > There are a number of options to fix this, including modifying
> > os-collect-config to auto-run os-refresh-config on a regular basis or
> > setting os-refresh-config to be its own service running via upstart or
> > similar that triggers every 15 minutes.
> > 
> > I'm sure there are other solutions to these problems, however I know from
> > experience that claiming this is solved through "education of users" or
> > (more severely!) via HR is not a sensible approach to take as by the time
> > you realise that your configuration has been changed for the last 24
> > hours it's often too late!
> So I see two problems highlighted above.
> 
> 1) We don't re-assert ephemeral state set by o-r-c scripts. You're right,
> and we've been talking about it for a while. The right thing to do is
> have os-collect-config re-run its command on boot. I don't think a cron
> job is the right way to go, we should just have a file in /var/run that
> is placed there only on a successful run of the command. If that file
> does not exist, then we run the command.
> 
> I've just opened this bug in response:
> 
> https://bugs.launchpad.net/os-collect-config/+bug/1334804
> 

I have been looking into bug #1334804 and I have a review up to resolve it. I 
want to highlight something.

Currently on a reboot we start all services via upstart (on Debian anyway),
and there have been quite a lot of issues around this: missing upstart
scripts and timing issues. I don't know what the issues are on Fedora.

So with a fix to #1334804, on a reboot upstart will first start all the
services (potentially with out-of-date configuration). Then o-c-c will start
o-r-c, which will now configure all the services and restart them, or start
them if upstart isn't configured properly.
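
For illustration only, here is a minimal sketch of the marker-file idea from
the quoted message: run the command unless a marker in /var/run says it has
already succeeded since boot. This is not the code from the review. The marker
path and the function names are assumptions made up for this example, and it
relies on /var/run being cleared on reboot, which is typical but not
guaranteed everywhere.

```python
import os
import subprocess

# Hypothetical marker path (an assumption, not taken from os-collect-config).
# /var/run is typically cleared on reboot, so the file only exists if the
# command has succeeded since the last boot.
MARKER = "/var/run/os-collect-config/command-succeeded"


def needs_run(marker=MARKER):
    """Return True if the command has not succeeded since boot."""
    return not os.path.exists(marker)


def run_once_per_boot(command, marker=MARKER):
    """Run command if no marker exists; create the marker only on success.

    Returns True if the command was run, False if it was skipped.
    """
    if not needs_run(marker):
        return False
    # check_call raises CalledProcessError on a non-zero exit status,
    # so the marker below is never written after a failed run.
    subprocess.check_call(command)
    marker_dir = os.path.dirname(marker)
    if marker_dir:
        os.makedirs(marker_dir, exist_ok=True)
    with open(marker, "w"):
        pass  # an empty file is enough
    return True
```

Something like run_once_per_boot(["os-refresh-config"]) would then re-assert
the configuration on the first pass after a reboot and be skipped afterwards
until a real configuration change triggers a run.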

I would like to turn off all boot scripts for the services we configure and
leave starting them entirely to o-r-c. I think this will simplify things and
put us in control of starting services. I believe it will also narrow the gap
between Fedora and Debian or Debian and Debian, so that what works on one
should work on the other, which makes things easier for developers.
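
As a rough sketch of one way to do this on upstart systems: upstart honours a
"manual" stanza in /etc/init/<job>.override, which stops the job from starting
automatically at boot while still allowing it to be started by hand. The
helper name below is hypothetical, the job name in the usage note is only an
example, and nothing here says this is the mechanism the change would
actually use.

```python
import os


def disable_autostart(job, init_dir="/etc/init"):
    """Write <job>.override containing 'manual' so upstart skips it at boot.

    Note: this overwrites any existing override file for the job.
    Returns the path of the override file that was written.
    """
    path = os.path.join(init_dir, job + ".override")
    with open(path, "w") as f:
        f.write("manual\n")
    return path
```

For example, disable_autostart("nova-api") would write
/etc/init/nova-api.override, leaving it to o-r-c (or an operator) to start
the job explicitly.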

Having the ability to run "service nova-api stop|start|restart" is very handy,
but this will be a manual thing and I intend to leave that there.

What do people think, and how best do I push this forward? I feel that this
leads into the re-assert-system-state spec, but mainly I think this is a
bug and doesn't require a spec.

I will be at the upcoming TripleO mid-cycle meetup and am willing to discuss
this with anyone interested, and to put together the necessary bits to make
this happen.

Michael
