[OpenStack-Infra] Log storage/serving

Monty Taylor mordred at inaugust.com
Thu Oct 10 18:26:53 UTC 2013



On 10/10/2013 02:06 PM, Clark Boylan wrote:
> On Thu, Oct 10, 2013 at 10:42 AM, James E. Blair <jeblair at openstack.org> wrote:
>> Joshua Hesketh <joshua.hesketh at rackspace.com> writes:
>>
>>> On 9/25/13 2:47 AM, James E. Blair wrote:
>>>> Joshua Hesketh <joshua.hesketh at rackspace.com> writes:
>>>>
>>>>> On 9/17/13 11:00 PM, Monty Taylor wrote:
>>>>>> On 09/16/2013 07:22 PM, Joshua Hesketh wrote:
>>>>>>> So if zuul dictates where a log goes and we place the objects in swift
>>>>>>> with that path (change / patchset / pipeline / job / run) then zuul
>>>>>>> could also handle placing indexes as it should know which objects to
>>>>>>> expect.
>>>>>>>
>>>>>>> That said, if the path is deterministic (such as that) and the workers
>>>>>>> provide the index for a run then I'm not sure how useful an index for
>>>>>>> patchsets would be. I'd be interested to know if anybody uses the link
>>>>>>> http://logs.openstack.org/34/45334/ without having come from gerrit or
>>>>>>> another source where it is published.
>>>>>> https://pypi.python.org/pypi/git-os-job
>>>>> Right, but that calculates the path (as far as I can see) so we
>>>>> therefore still don't necessarily need indexes generated.
>>>> The final portion of the URL, signifying the run, is effectively random.
>>>> So that tool actually relies on a one-level-up index page.  (That tool
>>>> works on post jobs rather than check or gate, but the issues are
>>>> similar).
>>>
>>> So two questions,
>>> 1) Do we need a random job run? is it for debugging or something? And
>>> if so, can we provide it another way.
>>
>> I don't understand this question -- are you asking "does anyone need to
>> access a run other than the one left as a comment in gerrit?"  That's
>> answered in my text you quoted below.
>>
>>> 2) What if the tool provided the index for its runs?
>>
>> I think we agree that would be fairly easy, at least starting from the
>> point of the individual run and working down the tree.  I think it's the
>> indexes of runs that complicates this.
>>
>>>>
>>>> Other than that, most end users do not use indexes outside of the
>>>> particular job run, and that's by design.  We try to put the most useful
>>>> URL in the message that is left in Gerrit.
>>>>
>>>> However, those of us working on the infrastructure itself, or those
>>>> engaged in special projects (such as mining old test logs), or even the
>>>> occasional person curious about whether the problem they are seeing was
>>>> encountered in _all_ runs of a test find the ability to locate logs from
>>>> any run _very_ useful.  If we lost that ability, we would literally have
>>>> no way to locate any logs other than the 'final' logs of a run, and
>>>> those only through the comment left in Gerrit, due to the issue
>>>> mentioned above.
>>>>
>>>> We can discuss doing that, but it would be a huge change from our
>>>> current practice.
>>> Yep, I'm convinced that the logs need to be accessible.
>>
>> Okay, let me try to summarize current thinking:
>>
>> * We want to try to avoid writing a tool that receives logs because
>>   swift provides most/all of the needed functionality.
>>   * The swift tempurl middleware will allow us to have the client
>>     directly PUT files in swift using a HMAC signed token.
>>   * This means any pre-processing of logs would need to happen with the
>>     log-uploading-client or via some unspecified event trigger.
>>
>> * We may or may not want a log-serving app.
>>   * We're doing neat things like filtering on level and html-ifying logs
>>     as we serve them with our current log-serving app.
>>   * We could probably do that processing pre-upload (including embedding
>>     javascript in html pages to do the visual filtering) and then we
>>     could serve static pages instead.
>>   * A log serving app may be required to provide some kinds of indexes.
>>
>> So to decide on the log-serving app, we need to figure out:
>>
>> 1) What do we want out of indexes?
>>
>> Let's take a current example log path:
>>
>>   http://logs.openstack.org/95/50795/4/check/check-grenade-devstack-vm/3c17e3c/console.html
>>
>> Ignoring the change[:-2] at the beginning since it's an implementation
>> artifact, that's basically:
>>
>>   /change/patchset/pipeline/job/run[random]/
>>
>> The upload script can easily handle creating index pages below that
>> point.  But since it runs in the context of a job run, it can't create
>> index pages above that (besides the technical difficulty, we don't want
>> to give it permission outside of its run anyway).  So I believe that
>> without a log-receiving app, our only options are:
>>
>>   a) Use the static web swift middleware to provide indexes.  Due to the
>>   intersection of this feature, CDN, and container sizes with our
>>   current providers, this is complicated and we end up at a dead end
>>   every time we talk through it.
>>
>>   b) Use a log-serving application to generate index pages where we need
>>   them.  We could do this by querying swift.  If we eliminate the
>>   ability to list ridiculously large indexes (like all changes, etc) and
>>   restrict it down to the level of, say, a single change, then this
>>   might be manageable.  However, swift may still have to perform a large
>>   query to get us down to that level.
>>
>>   c) Reduce the discoverability of test runs.  We could actually just
>>   collapse the whole path into a random string and leave that as a
>>   comment in Gerrit.  Users would effectively never be able to discover
>>   any runs other than the final ones that are reported in Gerrit, and
>>   even comparing runs for different patchsets would involve looking up
>>   the URL for each in the respective Gerrit comments.  Openstack-infra
>>   tools, such as elastic-recheck, could still discover other runs by
>>   watching for ZMQ or Gearman events.
>>
>>   This would make little difference to most end-users as well as project
>>   tooling, but it would make it a little harder to develop new project
>>   tooling without access to that event stream.
>>
>> Honestly, option C is growing on me, but I'd like some more feedback on
>> that.
>>
>> 2) What do we want out of processing?
>>
>> Currently we HTMLify and filter logs by log level at run-time when
>> serving them.  I think our choices are:
>>
>>   a) Continue doing this -- this requires a log-serving app that will
>>   fetch logs from swift, process them, and serve them.
>>
>>   b) Pre-process logs before uploading them.  HTMLify and add
>>   client-side javascript line-level filtering.  The logstash script may
>>   need to do its own filtering since it won't be running a javascript
>>   interpreter, but it could probably still do so based on metadata
>>   encoded into the HTML by the pre-processor.  Old logs won't benefit
>>   from new features in the pre-processor though (unless we really feel
>>   like batch-reprocessing).
>>
>> I think the choices of 1c and 2b get us out of the business of running
>> log servers altogether and moves all the logic and processing to the
>> edges.  I'm leaning toward them for that reason.
>>
> I agree about 2b, we can push specialized filtering into the places
> that need it if necessary rather than having a catch all centralized
> system. I am not quite sold on 1c though. If there is ever a need to
> go back in time and find files from a particular time range and
> project we would have to rely on Gerrit comments as an index which
> seems less than ideal. Or we would have to do something with the tools
> swift provides. Swift does allow for the attachment of arbitrary meta
> data on objects, but doesn't appear to support an easy way for using
> that info as an index (or filter). The more I think about a pure swift
> solution the more I like it (someone else can deal with the hard
> problems), but I do think we need to consider recording some index
> that isn't Gerrit.

2b++

1c - I think I'm pretty sold on as well. However, on what clark is
saying, if we put the metadata for a run that _would_ be used for an
index into a swift object, then we could, if we ever wanted one, write
a log index serving app. It would be full of expensive operations - but
I think our need for it would be low. We could also write it later as
needed.

> If a pure swift solution isn't doable what about a simple transaction
> log that is recorded on disk or in a DB? We wouldn't need to expose
> this to everyone, but having a record that maps build info to log
> objects would be handy especially if parsing it doesn't require access
> to Gerrit comments or the Gerrit DB. (Though this may be of minimal
> value as Gerrit does provide a simple map for us).
> 
> Clark
> 
> _______________________________________________
> OpenStack-Infra mailing list
> OpenStack-Infra at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
> 



More information about the OpenStack-Infra mailing list