[OpenStack-Infra] Log storage/serving

Clark Boylan clark.boylan at gmail.com
Thu Oct 10 18:06:18 UTC 2013


On Thu, Oct 10, 2013 at 10:42 AM, James E. Blair <jeblair at openstack.org> wrote:
> Joshua Hesketh <joshua.hesketh at rackspace.com> writes:
>
>> On 9/25/13 2:47 AM, James E. Blair wrote:
>>> Joshua Hesketh <joshua.hesketh at rackspace.com> writes:
>>>
>>>> On 9/17/13 11:00 PM, Monty Taylor wrote:
>>>>> On 09/16/2013 07:22 PM, Joshua Hesketh wrote:
>>>>>> So if zuul dictates where a log goes and we place the objects in swift
>>>>>> with that path (change / patchset / pipeline / job / run) then zuul
>>>>>> could also handle placing indexes as it should know which objects to
>>>>>> expect.
>>>>>>
>>>>>> That said, if the path is deterministic (such as that) and the workers
>>>>>> provide the index for a run then I'm not sure how useful an index for
>>>>>> patchsets would be. I'd be interested to know if anybody uses the link
>>>>>> http://logs.openstack.org/34/45334/ without having come from gerrit or
>>>>>> another source where it is published.
>>>>> https://pypi.python.org/pypi/git-os-job
>>>> Right, but that calculates the path (as far as I can see), so we
>>>> still don't necessarily need indexes generated.
>>> The final portion of the URL, signifying the run, is effectively random.
>>> So that tool actually relies on a one-level-up index page.  (That tool
>>> works on post jobs rather than check or gate, but the issues are
>>> similar).
>>
>> So, two questions:
>> 1) Do we need a random job run? Is it for debugging or something? And
>> if so, can we provide it another way?
>
> I don't understand this question -- are you asking "does anyone need to
> access a run other than the one left as a comment in gerrit?"  That's
> answered in my text you quoted below.
>
>> 2) What if the tool provided the index for its runs?
>
> I think we agree that would be fairly easy, at least starting from the
> point of the individual run and working down the tree.  I think it's the
> indexes of runs that complicate this.
>
>>>
>>> Other than that, most end users do not use indexes outside of the
>>> particular job run, and that's by design.  We try to put the most useful
>>> URL in the message that is left in Gerrit.
>>>
>>> However, those of us working on the infrastructure itself, or those
>>> engaged in special projects (such as mining old test logs), or even the
>>> occasional person curious about whether the problem they are seeing was
>>> encountered in _all_ runs of a test find the ability to locate logs from
>>> any run _very_ useful.  If we lost that ability, we would literally have
>>> no way to locate any logs other than the 'final' logs of a run, and
>>> those only through the comment left in Gerrit, due to the issue
>>> mentioned above.
>>>
>>> We can discuss doing that, but it would be a huge change from our
>>> current practice.
>> Yep, I'm convinced that the logs need to be accessible.
>
> Okay, let me try to summarize current thinking:
>
> * We want to try to avoid writing a tool that receives logs because
>   swift provides most/all of the needed functionality.
>   * The swift tempurl middleware will allow us to have the client
>     directly PUT files in swift using an HMAC-signed token.
>   * This means any pre-processing of logs would need to happen with the
>     log-uploading-client or via some unspecified event trigger.
>
> * We may or may not want a log-serving app.
>   * We're doing neat things like filtering on level and html-ifying logs
>     as we serve them with our current log-serving app.
>   * We could probably do that processing pre-upload (including embedding
>     javascript in html pages to do the visual filtering) and then we
>     could serve static pages instead.
>   * A log serving app may be required to provide some kinds of indexes.
>
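As an aside on the tempurl point above: the token the client needs is just
an HMAC-SHA1 over the method, expiry and object path, so the upload script
stays tiny. A rough sketch (key, account and container names here are made
up, not the real deployment values):

    import hmac
    import time
    from hashlib import sha1

    key = 'temp-url-key'     # would come from X-Account-Meta-Temp-URL-Key
    method = 'PUT'
    expires = int(time.time()) + 3600
    path = '/v1/AUTH_logs/logs/95/50795/4/check/some-job/3c17e3c/console.html'

    sig = hmac.new(key, '%s\n%s\n%s' % (method, expires, path), sha1).hexdigest()
    url = 'https://swift.example.com%s?temp_url_sig=%s&temp_url_expires=%s' % (
        path, sig, expires)
    # the worker can now PUT the file straight to 'url' with no credentials
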
> So to decide on the log-serving app, we need to figure out:
>
> 1) What do we want out of indexes?
>
> Let's take a current example log path:
>
>   http://logs.openstack.org/95/50795/4/check/check-grenade-devstack-vm/3c17e3c/console.html
>
> Ignoring the change[-2:] prefix at the beginning since it's an implementation
> artifact, that's basically:
>
>   /change/patchset/pipeline/job/run[random]/
>
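For what it's worth, building that path in the upload script is trivial;
something like the following (the parameter names are the usual
ZUUL_*/Jenkins ones and the run id is just a shortened uuid, so treat this
as illustrative only):

    import os
    import uuid

    change = os.environ['ZUUL_CHANGE']      # e.g. '50795'
    patchset = os.environ['ZUUL_PATCHSET']  # e.g. '4'
    pipeline = os.environ['ZUUL_PIPELINE']  # e.g. 'check'
    job = os.environ['JOB_NAME']            # e.g. 'check-grenade-devstack-vm'
    run = uuid.uuid4().hex[:7]              # the effectively random part

    log_path = '/'.join([change[-2:], change, patchset, pipeline, job, run])
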
> The upload script can easily handle creating index pages below that
> point.  But since it runs in the context of a job run, it can't create
> index pages above that (besides the technical difficulty, we don't want
> to give it permission outside of its run anyway).  So I believe that
> without a log-receiving app, our only options are:
>
>   a) Use the static web swift middleware to provide indexes.  Due to the
>   intersection of this feature, CDN, and container sizes with our
>   current providers, this is complicated and we end up at a dead end
>   every time we talk through it.
>
>   b) Use a log-serving application to generate index pages where we need
>   them.  We could do this by querying swift.  If we eliminate the
>   ability to list ridiculously large indexes (like all changes, etc) and
>   restrict it down to the level of, say, a single change, then this
>   might be manageable.  However, swift may still have to perform a large
>   query to get us down to that level.
>
>   c) Reduce the discoverability of test runs.  We could actually just
>   collapse the whole path into a random string and leave that as a
>   comment in Gerrit.  Users would effectively never be able to discover
>   any runs other than the final ones that are reported in Gerrit, and
>   even comparing runs for different patchsets would involve looking up
>   the URL for each in the respective Gerrit comments.  Openstack-infra
>   tools, such as elastic-recheck, could still discover other runs by
>   watching for ZMQ or Gearman events.
>
>   This would make little difference to most end-users as well as project
>   tooling, but it would make it a little harder to develop new project
>   tooling without access to that event stream.
>
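For what it's worth, consuming that event stream is not much code either.
A minimal sketch, assuming the existing ZMQ build-event publisher (the
endpoint and message format shown here are illustrative, not the real
deployment values):

    import json
    import zmq

    context = zmq.Context()
    socket = context.socket(zmq.SUB)
    socket.setsockopt(zmq.SUBSCRIBE, '')        # subscribe to everything
    socket.connect('tcp://jenkins.example.org:8888')

    while True:
        topic, data = socket.recv().split(None, 1)  # e.g. 'onFinalized {...}'
        event = json.loads(data)
        # map the build parameters in 'event' to the run's log location here
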
> Honestly, option C is growing on me, but I'd like some more feedback on
> that.
>
> 2) What do we want out of processing?
>
> Currently we HTMLify and filter logs by log level at run-time when
> serving them.  I think our choices are:
>
>   a) Continue doing this -- this requires a log-serving app that will
>   fetch logs from swift, process them, and serve them.
>
>   b) Pre-process logs before uploading them.  HTMLify and add
>   client-side javascript line-level filtering.  The logstash script may
>   need to do its own filtering since it won't be running a javascript
>   interpreter, but it could probably still do so based on metadata
>   encoded into the HTML by the pre-processor.  Old logs won't benefit
>   from new features in the pre-processor though (unless we really feel
>   like batch-reprocessing).
>
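To make 2b a bit more concrete, the pre-processing could be as small as
wrapping each line in a span tagged with its level, so that a little
client-side javascript (or the logstash script) can filter on it. A rough
sketch, nothing more:

    import cgi
    import re

    LEVEL_RE = re.compile(r'\b(DEBUG|INFO|AUDIT|WARNING|ERROR|TRACE)\b')

    def htmlify(lines):
        out = ['<html><body><pre>']
        for line in lines:
            match = LEVEL_RE.search(line)
            level = match.group(1) if match else 'NONE'
            # the class attribute is the metadata later consumers filter on
            out.append('<span class="%s">%s</span>' % (level, cgi.escape(line)))
        out.append('</pre></body></html>')
        return '\n'.join(out)
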
> I think the choices of 1c and 2b get us out of the business of running
> log servers altogether and move all the logic and processing to the
> edges.  I'm leaning toward them for that reason.
>
I agree about 2b: we can push specialized filtering into the places
that need it if necessary rather than having a catch-all centralized
system. I am not quite sold on 1c though. If there is ever a need to
go back in time and find files from a particular time range and
project, we would have to rely on Gerrit comments as an index, which
seems less than ideal. Or we would have to do something with the
tools swift provides. Swift does allow arbitrary metadata to be
attached to objects, but doesn't appear to offer an easy way to use
that info as an index (or filter). The more I think about a pure
swift solution the more I like it (someone else can deal with the
hard problems), but I do think we need to consider recording some
index that isn't Gerrit.
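
To illustrate what swift does give us (names below are made up): we can
attach metadata at upload time, but the only index-like thing we can get
back out is a prefix/delimiter listing, not a query on that metadata:

    from swiftclient.client import Connection

    conn = Connection(authurl='https://auth.example.com/v1.0',
                      user='loguser', key='secret')

    # metadata is stored with the object but cannot be searched on
    conn.put_object('logs', '95/50795/4/check/some-job/3c17e3c/console.html',
                    contents=open('console.html').read(),
                    content_type='text/html',
                    headers={'X-Object-Meta-Change': '50795',
                             'X-Object-Meta-Pipeline': 'check'})

    # the closest thing to an index: list by path prefix, one 'directory'
    # level at a time
    headers, listing = conn.get_container('logs', prefix='95/50795/4/check/',
                                          delimiter='/')
    names = [entry.get('subdir', entry.get('name')) for entry in listing]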

If a pure swift solution isn't doable, what about a simple
transaction log recorded on disk or in a DB? We wouldn't need to
expose this to everyone, but having a record that maps build info to
log objects would be handy, especially if parsing it doesn't require
access to Gerrit comments or the Gerrit DB. (Though this may be of
minimal value, as Gerrit does provide a simple map for us.)
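
Something as small as the following would do it (the schema and names are
purely illustrative):

    import sqlite3

    db = sqlite3.connect('log-index.db')
    db.execute("""CREATE TABLE IF NOT EXISTS builds (
                      change_number INTEGER, patchset INTEGER, pipeline TEXT,
                      job TEXT, run TEXT, finished TIMESTAMP, log_url TEXT)""")

    def record(change_number, patchset, pipeline, job, run, log_url):
        db.execute("INSERT INTO builds VALUES (?, ?, ?, ?, ?, datetime('now'), ?)",
                   (change_number, patchset, pipeline, job, run, log_url))
        db.commit()

    record(50795, 4, 'check', 'check-grenade-devstack-vm', '3c17e3c',
           'http://logs.openstack.org/95/50795/4/check/check-grenade-devstack-vm/3c17e3c/')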

Clark


