[openstack-dev] [TripleO] os-refresh-config run frequency

Robert Collins robertc at robertcollins.net
Sun Jul 20 19:30:41 UTC 2014


Sure. Put it on the agenda, perhaps for Tuesday morning.
On 20 Jul 2014 12:11, "Chris Jones" <cmsj at tenshu.net> wrote:

> Hi
>
> I also have some strong concerns about this. Can we get round a table this
> week and hash it out?
>
> Cheers,
> Chris
>
>
> >> On 20 Jul 2014, at 14:51, Dan Prince <dprince at redhat.com> wrote:
> >>
> >>> On Thu, 2014-07-17 at 15:54 +0100, Michael Kerrin wrote:
> >>> On Thursday 26 June 2014 12:20:30 Clint Byrum wrote:
> >>>
> >>> Excerpts from Macdonald-Wallace, Matthew's message of 2014-06-26
> >>> 04:13:31 -0700:
> >>>> Hi all,
> >>>>
> >>>> I've been working more and more with TripleO recently and whilst it
> >>>> does seem to solve a number of problems well, I have found a couple
> >>>> of idiosyncrasies that I feel would be easy to address.
> >>>>
> >>>> My primary concern lies in the fact that os-refresh-config does not
> >>>> run on every boot/reboot of a system. Surely a reboot *is* a
> >>>> configuration change and therefore we should ensure that the box has
> >>>> come up in the expected state with the correct config?
> >>>>
> >>>> This is easily fixed through the addition of an "@reboot" entry in
> >>>> /etc/crontab to run o-r-c or (less easily) by re-designing o-r-c to
> >>>> run as a service.
> >>>>
> >>>> My secondary concern is that through not running os-refresh-config on
> >>>> a regular basis by default (i.e. every 15 minutes or something in the
> >>>> same style as chef/cfengine/puppet), we leave ourselves exposed to
> >>>> someone trying to make a "quick fix" to a production node and taking
> >>>> that node offline the next time it reboots because the config was
> >>>> still left as broken owing to a lack of updates to HEAT (I'm thinking
> >>>> a "quick change" to allow root access via SSH during a major incident
> >>>> that is then left unchanged for months because no-one updated HEAT).
> >>>>
> >>>> There are a number of options to fix this including Modifying
> >>>> os-collect-config to auto-run os-refresh-config on a regular basis or
> >>>> setting os-refresh-config to be its own service running via upstart or
> >>>> similar that triggers every 15 minutes
> >>>>
> >>>> I'm sure there are other solutions to these problems, however I know
> >>>> from experience that claiming this is solved through "education of
> >>>> users" or (more severely!) via HR is not a sensible approach to take
> >>>> as by the time you realise that your configuration has been changed
> >>>> for the last 24 hours it's often too late!
> >>
> >>> So I see two problems highlighted above.
> >>>
> >>> 1) We don't re-assert ephemeral state set by o-r-c scripts. You're
> >>> right, and we've been talking about it for a while. The right thing to
> >>> do is have os-collect-config re-run its command on boot. I don't think
> >>> a cron job is the right way to go, we should just have a file in
> >>> /var/run that is placed there only on a successful run of the command.
> >>> If that file does not exist, then we run the command.
> >>>
> >>> I've just opened this bug in response:
> >>>
> >>> https://bugs.launchpad.net/os-collect-config/+bug/1334804
> >>
> >> I have been looking into bug #1334804 and I have a review up to
> >> resolve it. I want to highlight something.
> >>
> >> Currently on a reboot we start all services via upstart (on debian
> >> anyways) and there have been quite a lot of issues around this -
> >> missing upstart scripts and timing issues. I don't know the issues on
> >> fedora.
> >>
> >> So with a fix to #1334804, on a reboot upstart will start all the
> >> services first (with potentially out-of-date configuration), then
> >> o-c-c will start o-r-c and will now configure all services and restart
> >> them or start them if upstart isn't configured properly.
> >>
> >> I would like to turn off all boot scripts for services we configure
> >> and leave all this to o-r-c. I think this will simplify things and put
> >> us in control of starting services. I believe that it will also narrow
> >> the gap between fedora and debian or debian and debian so what works
> >> on one should work on the other and make it easier for developers.
> >
> > I'm not sold on this approach. At the very least I think we want to make
> > this optional because not all deployments may want to have o-r-c be the
> > central service starting agent. So I'm opposed to this being our (only!)
> > default...
> >
> > The job of o-r-c in this regard is to assert state... which to me means
> > making sure that a service is configured correctly (config files, set to
> > start on boot, and initially started). Requiring o-r-c to be the service
> > starting agent (always) is beyond the scope of the o-r-c tool.
> >
> > If people want to use it in that mode I think having an *option* to do
> > this is fine. I don't think it should be required though. Furthermore I
> > don't think we should get into the habit of writing our elements in such
> > a manner that things no longer start on boot without o-r-c in the mix.
> >
> > I do think we can solve these problems. But taking a hardwired
> > prescriptive approach is not good here...
> >
> >> Having the ability to run "service nova-api stop|start|restart" is
> >> very handy, but this will be a manual thing and I intend to leave
> >> that there.
> >>
> >> What do people think, and how best do I push this forward? I feel that
> >> this leads into the re-assert-system-state spec, but mainly I think
> >> this is a bug and doesn't require a spec.
> >>
> >> I will be at the tripleo mid-cycle meetup next week and am willing to
> >> discuss this with anyone interested and put together the necessary
> >> bits to make this happen.
> >>
> >> Michael
> >>
> >> _______________________________________________
> >> OpenStack-dev mailing list
> >> OpenStack-dev at lists.openstack.org
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
>
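
For reference, the marker-file idea in Clint's quoted message (run the
command only if a file in /var/run is missing, and write that file only
after a successful run) could be sketched roughly as below. This is just
an illustration, not os-collect-config's actual code: the function name
and the marker path in the comment are made up, and it assumes /var/run
is emptied on reboot (e.g. because it is a tmpfs).

```python
import os
import subprocess


def run_if_not_marked(command, marker_path):
    """Run `command` unless `marker_path` already exists.

    The marker is created only after the command exits successfully, so
    a failed run is retried on the next call. Returns True if the
    command was run, False if it was skipped.

    Both the function name and the marker path are hypothetical; they
    are not taken from os-collect-config.
    """
    if os.path.exists(marker_path):
        # A previous run since boot already succeeded; nothing to do.
        return False

    # check_call raises CalledProcessError on a non-zero exit status,
    # in which case the marker below is never written.
    subprocess.check_call(command)

    # Record success. Assumption: the marker directory is wiped at
    # boot, so the command runs again after every reboot.
    with open(marker_path, "w"):
        pass
    return True


# Hypothetical usage (the marker path is an assumption):
#   run_if_not_marked(["os-refresh-config"],
#                     "/var/run/os-refresh-config.done")
```

The point of writing the marker last is that a crash or failure part way
through leaves no file behind, so the next start tries again.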