[OpenStack-Infra] Log storage/serving

Joshua Hesketh joshua.hesketh at rackspace.com
Tue Sep 17 00:22:55 UTC 2013


So if Zuul dictates where a log goes and we place the objects in Swift
under that path (change / patchset / pipeline / job / run), then Zuul
could also handle placing the indexes, since it should know which
objects to expect.
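
As a rough sketch of what a deterministic path could look like (Python;
the Run class and log_prefix function are hypothetical names, not
anything that exists in Zuul today, and the leading two-digit shard is
only inferred from the example URL quoted below):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """Hypothetical description of one job run, as Zuul might know it."""
    change: int
    patchset: int
    pipeline: str
    job: str
    run_id: str  # short identifier for the run, e.g. "7c48ee3"


def log_prefix(run: Run) -> str:
    """Return the object-name prefix under which a run's logs would live.

    Assumption: the first path component is the last two digits of the
    change number, as in logs.openstack.org/34/45334/.
    """
    shard = f"{run.change % 100:02d}"
    return (
        f"{shard}/{run.change}/{run.patchset}/"
        f"{run.pipeline}/{run.job}/{run.run_id}/"
    )


run = Run(45334, 7, "check", "gate-zuul-python27", "7c48ee3")
print(log_prefix(run))
```

Since anything that knows those five values can rebuild the prefix, the
same function would also tell an index writer where to look.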

That said, if the path is deterministic (as above) and the workers
provide the index for a run, then I'm not sure how useful an index for
patchsets would be. I'd be interested to know if anybody uses the link
http://logs.openstack.org/34/45334/ without having come from Gerrit or
another source where it is published. Because the path is deterministic,
perhaps the use case where it is needed could be served some other way?
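
One possible "other way", purely as a sketch: if object names follow the
deterministic layout, a change-level listing can be derived on demand
from a flat list of names (for instance, the result of a Swift container
listing restricted by prefix) rather than stored as an index object. The
function name and the file names in the example are made up for
illustration.

```python
from collections import defaultdict


def runs_for_change(object_names, change_prefix):
    """Group flat object names under change_prefix by run.

    Returns {run_prefix: [paths relative to that run]}, where run_prefix
    is change_prefix followed by patchset/pipeline/job/run/.
    """
    runs = defaultdict(list)
    for name in object_names:
        if not name.startswith(change_prefix):
            continue
        parts = name[len(change_prefix):].split("/")
        # Need patchset, pipeline, job, run, and at least a file name.
        if len(parts) < 5:
            continue
        run_prefix = change_prefix + "/".join(parts[:4]) + "/"
        relative = "/".join(parts[4:])
        if relative:  # skip bare directory-marker names
            runs[run_prefix].append(relative)
    return dict(runs)


# Hypothetical object names; only the directory layout comes from the thread.
names = [
    "34/45334/7/check/gate-zuul-python27/7c48ee3/console.html",
    "34/45334/7/check/gate-zuul-python27/7c48ee3/logs/tox.log",
    "34/45335/1/check/some-job/abc1234/console.html",
]
for run_prefix, files in runs_for_change(names, "34/45334/").items():
    print(run_prefix, files)
```

Whether this is cheap enough depends on how many objects sit under a
change, which I haven't measured.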

Cheers,
Josh

--
Rackspace Australia

On 9/13/13 2:49 AM, James E. Blair wrote:
> Joshua Hesketh <joshua.hesketh at rackspace.com> writes:
>
>> We could then use either pseudo folders[0] or have the worker generate
>> an index. For example, why not create an index object with links to
>> the other objects (using the known serving application URL prepended)?
>> In fact, the reporter can choose whether to generate an index file or
>> just send the pseudo folder to be served up.
> This is one of the main reasons we don't use swift today.  Consider this
> directory:
>
> http://logs.openstack.org/34/45334/
>
> It contains all of the runs of all of the jobs for all of the patchsets
> for change 45334.  That's very useful for discoverability; the
> alternative is to read the comments in gerrit and examine the links
> one-by-one.  A full-depth example:
>
> http://logs.openstack.org/34/45334/7/check/gate-zuul-python27/7c48ee3/
>
> (That's change / patchset / pipeline / job / run.)
>
> Each individual job is concerned with only the last component of that
> hierarchy, and has no knowledge of what other related jobs may have run
> before or will run after, so creating an index under those circumstances
> is difficult.  Moreover, if you consider that in the new system, we
> won't be able to trust a job with access to any pseudo-directory level
> higher than its individual run, there is actually no way for it to
> create any of the higher-level indexes.
>
> If we want to maintain that level of discoverability, then I think we
> need something outside of the job to create indexes (in my earlier
> email, the artifact-serving component does this).  If we are okay losing
> that, then yes, we can just sort of shove everything related to a run
> into a certain arbitrary location whose path won't be important anymore.
> Within the area written to by a single run, however, we may still have
> subdirectories.  Whether and how to create swift directory markers for
> those is still an issue (see my other email).  But perhaps they are not
> necessary, and yes, certainly _within the directory for a run_, we could
> create index files as needed.
>
> Note the following implementation quirks we have observed:
>
>   * Rackspace does not perform autoindex-like functionality for directory
>     markers unless you are using the CDN (which has its own complications
>     related to cache timeouts, DNS hostnames, etc.).
>
>   * HPCloud does not recognize directory markers when generating index
>     pages for the public view of containers.
>
> We may want and indeed be able to use the staticweb feature, along with
> the CDN -- but there's enough complication here that we'll need to get
> fairly detailed in the design and validate our assumptions.
>
> -Jim
>
> _______________________________________________
> OpenStack-Infra mailing list
> OpenStack-Infra at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra



