[openstack-dev] [all] gate debugging
Jeremy Stanley
fungi at yuggoth.org
Wed Aug 27 19:11:30 UTC 2014
On 2014-08-27 14:54:55 -0400 (-0400), Sean Dague wrote:
[...]
> I think we break down on communication when we get into a
> conversation of "I want to learn gate debugging" because I don't
> quite know what that means, or where the starting point of
> understanding is. So those intentions are well meaning, but tend
> to stall. The reality was there was no road map for those of us
> that dive in, it's just understanding how OpenStack holds together
> as a whole and where some of the high risk parts are. And a lot of
> that comes with days staring at code and logs until patterns
> emerge.
[...]
One way to put this in perspective, I think, is to talk about
devstack-gate integration test jobs (which are only one of the many
kinds of jobs we gate on, but possibly the most nebulous case).
Since devstack-gate mostly just sets up an OpenStack (for a variety
of definitions thereof) and then runs some defined suite of
transformations and tests against it, a failure really is quite
often "this cloud broke." You are really looking, post-mortem, at
what would in production probably be considered a catastrophic
cascade failure involving multiple moving parts, where all you have
left is (hopefully enough, sometimes not) logs of what the services
were doing when all hell broke loose. However, you're an ops team of
one trying to get to the bottom of why your environment went toes
up... and then you're a developer trying to work out what to patch
where to make it not happen again (if you're lucky).
That is "gate debugging" and, to support your point, is something
which can at best be only vaguely documented.
--
Jeremy Stanley