[Openstack-operators] [TripleO] Default logging configuration
jaypipes at gmail.com
Wed Mar 12 01:27:49 UTC 2014
On Wed, 2014-03-12 at 08:28 +1300, Robert Collins wrote:
> Hi - I filed https://bugs.launchpad.net/tripleo/+bug/1290759 last
> night after we found yet another CI issue that just cannot be debugged
> with the default logging configuration and also chatting with some of
> our public cloud folk who are experimenting with TripleO -- turns out
> that we don't run the default logging configuration in public cloud
> because it's not suitable for analysing and solving issues.
> I'm seeking a copy of the actual configuration that's used, though it
> sounds like it's just 'debugging' except for Neutron which was tooooo
> However, I'd like to do a poll - what do you, dear operators, do for logging?
> - Whats your default log level (upstream default, verbose, debug,
> something special?)
It pretty much has to be DEBUG if you want to find anything useful.
> - How do you handle log overload?
> - Do you overlog and prune on display (e.g. via logstash / kibana)?
> - Do you increase logging only after a fault and hope it is reproducible?
> - Do you overlog and just deal with the volume?
> - Do you not care about logs for fault analysis?
> - What would you like to see most from upstream log defaults?
1) Better and more consistent information in debug messages.
Sean Dague has an effort already underway to tackle this issue across
projects. Things like dumping the contents of message packets into a
debug log message are, IMO, useless and clutter even the debug-level
logs. Clutter like that (and most of the output from the various WSGI
and keystoneclient.middleware.auth_token modules) makes it very
difficult to pick out the more important things, like when certain key
events in a task chain occur.
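To illustrate the point, here is a minimal sketch (the logger name, message shape, and `summarize_rpc` helper are all hypothetical, not any project's actual code) contrasting a dumped payload with a summarized debug line:

```python
import logging

log = logging.getLogger("nova.compute.demo")
log.setLevel(logging.DEBUG)

def summarize_rpc(payload):
    """Summarize an RPC message for a debug line instead of dumping it whole."""
    return "RPC call %s (args: %s)" % (payload["method"],
                                       ", ".join(sorted(payload["args"])))

msg = {"method": "run_instance",
       "args": {"image": "cirros-0.3.1", "flavor": "m1.small"}}

# Cluttered: the raw dict buries the one fact you need in noise.
log.debug("received message: %s", msg)

# Sharper: one short line naming the event, with just enough context.
log.debug(summarize_rpc(msg))
```

The second line still tells you which call arrived and with what arguments, without drowning the surrounding checkpoint messages.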
2) More INFO level messages
INFO level is underused, IMO. INFO messages should generally be shorter
than debug messages and provide a codified message indicating that a
condition was met or that an event succeeded or failed (when the failure
was expected). We should be able to view a log containing only INFO-level
messages and clearly see the flow of a request through one or more
OpenStack endpoints. Unfortunately, today there is little consistency
about where important "checkpoint" events log an INFO message.
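A sketch of what "checkpoint" INFO logging could look like: one short line per significant event, each stamped with the same request id, so the flow of a request is followable at INFO level alone. The logger name, format fields, and messages below are illustrative assumptions, not actual OpenStack output:

```python
import io
import logging
import uuid

# Capture output in a buffer so the example is self-contained.
buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.setFormatter(
    logging.Formatter("%(levelname)s [%(request_id)s] %(name)s: %(message)s"))

log = logging.getLogger("nova.api.demo")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False

# One request id threaded through every checkpoint via `extra`.
req_id = "req-" + str(uuid.uuid4())
ctx = {"request_id": req_id}

log.info("POST /servers accepted", extra=ctx)                 # request enters the API
log.info("instance scheduled to host compute-1", extra=ctx)   # scheduler decision
log.info("instance state changed to ACTIVE", extra=ctx)       # expected success event

print(buf.getvalue())
```

Grepping the resulting log for the request id yields the whole lifecycle of the request, with no DEBUG noise in between.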
3) Better consistency about the PID or thread/greenlet identifiers.
Some modules use a different format string than others to show the
process id and/or greenlet id, which makes debugging thread-related
problems needlessly difficult.
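The fix here is mechanical: one shared format string that always carries the process id and thread/greenlet name in the same position. The fields below are a sketch of such a string, not any project's actual default:

```python
import logging

# A single format string every service could share: pid and thread name
# always appear in the same columns, so grep patterns work everywhere.
FORMAT = "%(asctime)s %(process)d %(threadName)s %(levelname)s %(name)s %(message)s"

formatter = logging.Formatter(FORMAT)

# Build a record by hand to show exactly what the formatter emits.
record = logging.LogRecord("neutron.agent", logging.DEBUG, __file__, 0,
                           "port wiring started", None, None)
line = formatter.format(record)
print(line)
```

With eventlet-based services, `threadName` alone is not enough to distinguish greenlets, so a project would typically inject a greenthread identifier into the record as well; the point is only that whatever identifier is chosen should appear in the same place in every module's output.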
> Right now, TripleO is in an awkward position - we're realising that
> the defaults in OpenStack are not suitable for production use, but the
> developer community believes the problem is *too much logging* vs *too
> little* - so I'm seeking to get more information, and find out what
> other ops folk are doing, with the intent of
> getting one (or perhaps more) basic profiles that meet operator needs
> without everyone reinventing the wheel.