[openstack-dev] debug logs and defaults was (Thoughts on the patch test failure rate and moving forward)

Robert Collins robertc at robertcollins.net
Thu Jul 24 22:05:19 UTC 2014


On 25 July 2014 08:01, Sean Dague <sean at dague.net> wrote:

>> I'd like us to think about whether there is anything we can do to make
>> life easier in these kinds of hard debugging scenarios where the regular
>> logs are not sufficient.
>
> Agreed. Honestly, though, we also need to figure out first-fail
> detection on our logs. Because realistically if we can't debug
> failures from those, then I really don't understand how we're ever going
> to expect large users to.


I'm so glad you said that :). In conversations with our users, and
existing large deployers of OpenStack, one thing has come through very
consistently: our default logs are insufficient.

We had an extensive discussion about this in the TripleO mid-cycle
meetup, and I think we reached broad consensus on the following:
 - the defaults should be what folk are running in production
 - we don't want to lead on changing defaults - it's a big enough thing
that we want to drive the discussion rather than work around it by
changing our own defaults
 - large clouds are *today* running debug (with a few tweaks to remove
the most egregious log spammers and known security issues [like
dumping tokens into logs])
 - AFAICT productised clouds (push-button deploy etc) are running
something very similar
 - we would love it if developers *also* saw what users will see by
default, since that will tend to stop things getting both too spammy
and too sparse.
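
To make the "debug with a few tweaks" point concrete, here is a rough
sketch using only Python's standard logging module. It is not the
actual OpenStack configuration: the logger name "noisy.library", the
token pattern and the configure_logging() helper are all made up for
illustration. It turns debug on everywhere, quietens one chatty
logger, and masks anything that looks like a token before it is
written out.

```python
import logging
import re

# Hypothetical pattern: matches "token=<value>" or "X-Auth-Token: <value>".
TOKEN_RE = re.compile(r"(?i)\b(x-auth-token|token)(\s*[:=]\s*)(\S+)")


class RedactTokens(logging.Filter):
    """Mask anything that looks like an auth token before it is emitted."""

    def filter(self, record):
        # Render the message first so tokens passed as arguments are caught too.
        message = record.getMessage()
        record.msg = TOKEN_RE.sub(r"\1\2<redacted>", message)
        record.args = ()
        return True


def configure_logging(stream=None):
    """Debug by default, with one noisy logger turned down (assumed name)."""
    handler = logging.StreamHandler(stream)
    # A filter on the handler sees records from every logger that reaches it.
    handler.addFilter(RedactTokens())
    handler.setFormatter(
        logging.Formatter("%(levelname)s %(name)s %(message)s"))

    root = logging.getLogger()
    root.handlers[:] = [handler]
    root.setLevel(logging.DEBUG)

    # "noisy.library" stands in for whichever logger is the worst spammer.
    logging.getLogger("noisy.library").setLevel(logging.WARNING)
    return root
```

The filter sits on the handler rather than on individual loggers so
that the redaction applies no matter which module logged the message.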

So - I know that's brief - what we'd like to do is to poll a slightly
wider set of deployers - e.g. via a spec, perhaps with some help from
Tom with the users and ops groups - and get a baseline of the things
that there is consensus on and the things there isn't, and then just
change the defaults to match. Further, to achieve the 'developers see
the same thing as users' bit, we'd like to make devstack do what
TripleO does - use defaults for logging levels, particularly in the gate.

It's totally true that we have a good policy about logging and we're
changing things to fit it, but that's the long-term play: short term,
making the defaults match our deployments seems relatively easy and
immensely sane.

-Rob
-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud


