Open Stack

Tue Aug 27 17:47:05 UTC 2013

On Tue, Aug 27, 2013 at 10:15 AM, Clint Byrum <clint at fewbar.com> wrote:
> Excerpts from John Griffith's message of 2013-08-27 09:42:37 -0700:
>> On Tue, Aug 27, 2013 at 10:26 AM, Alex Gaynor <alex.gaynor at gmail.com> wrote:
>>
>> > I wonder if there's any sort of automation we can apply to this, for
>> > example having known rechecks have "signatures" and if a failure matches
>> > the signature it auto applies the recheck.
>> >
>>
>> I think we kinda already have that, the recheck list and the bug ID
>> assigned to it no?  Automatically scanning said list and doing the recheck
>> automatically seems like overkill in my opinion.  At some point human
>> thought/interaction is required and I don't think it's too much to ask a
>> technical contributor to simply LOOK at the output from the test runs
>> against their patches and help out a bit. At the very least if you didn't
>> test your patch yourself and waited for Jenkins to tell you it's broken I
>> would hope that a submitter would at least be motivated to fix their own
>> issue that they introduced.
>>
>
> It is worth thinking about though, because "ask a technical contributor
> to simply LOOK" is a lot more expensive than "let a script confirm the
> failure and tack it onto the list for rechecks".
>
> Ubuntu has something like this going for all of their users and it is
> pretty impressive.
>
> Apport and/or whoopsie see crashes and look at the
> backtraces/coredumps/etc and then (with user permission) submit a
> signature to the backend. It is then analyzed and the result is this:
>
> http://errors.ubuntu.com/
>
> Known false positives are shipped alongside packages so that they do
> not produce noise, and known points of pain for debugging are eased by
> including logs and other things in bug reports when users are running
> the dev release. This results in a much better metric for what bugs to
> address first. IIRC update-manager also checks in with a URL that is
> informed partially by this data about whether or not to update packages,
> so if there is a high fail rate early on, the server side will basically
> signal update-manager "don't update right now".
>
> I'd love to see our CI system enhanced to do all of the pattern
> matching to group failures by common patterns, and then when a technical
> contributor looks at these groups they have tons of data points to _fix_
> the problem rather than just spending their precious time identifying it.
>
> The point of the recheck system, IMHO, isn't to make running rechecks
> easier, it is to find and fix bugs.
>
This is definitely worth thinking about and we had a session on
dealing with CI logs to do interesting things like update bugs and
handle rechecks automatically at the Havana summit[0]. Since then we
have built a logstash + elasticsearch system[1] that filters many of
our test logs and indexes a subset of what was filtered (typically
anything with a log level greater than DEBUG). Building this system is
step one in being able to detect anomalous logs, update bugs, and
potentially perform automatic rechecks with the appropriate bug.
Progress has been somewhat slow, but the current setup should be
mostly stable. If anyone is interested in poking at these tools to do
interesting automation with them, feel free to bug the Infra team.
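
As a rough illustration of the kind of automation this could enable,
here is a minimal Python sketch that drops DEBUG lines and groups the
remaining messages by known failure signature. Everything in it is an
assumption for illustration: the log line layout, the two regexes, and
the bug numbers are made up, and none of it reflects the actual
logstash configuration.

```python
import re
from collections import defaultdict

# Hypothetical signatures: a regex for a known failure, keyed by a
# made-up bug number. Real entries would come from the recheck list.
KNOWN_SIGNATURES = {
    "1000001": re.compile(r"Timed out waiting for server \S+ to become ACTIVE"),
    "1000002": re.compile(r"Connection to \S+ refused"),
}

# Levels above DEBUG, mirroring the subset that gets indexed.
INDEXED_LEVELS = {"INFO", "WARNING", "ERROR", "CRITICAL"}

# Assumed line layout: "<date> <time> <LEVEL> <message>".
LINE_RE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<message>.*)$")


def indexed_messages(lines):
    """Yield messages from lines whose log level is above DEBUG."""
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") in INDEXED_LEVELS:
            yield match.group("message")


def group_by_bug(lines):
    """Group log messages by the first known signature they match."""
    groups = defaultdict(list)
    for message in indexed_messages(lines):
        for bug, pattern in KNOWN_SIGNATURES.items():
            if pattern.search(message):
                groups[bug].append(message)
                break
        else:
            # No known signature matched: a candidate for a new bug.
            groups["unclassified"].append(message)
    return dict(groups)
```

Messages that land under "unclassified" are the ones that still need
a human to look at them; the rest arrive already grouped under a bug.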

That said, we won't have something super automagic like that before
the end of Havana, which makes John's point an important one. If previous
release feature freezes are any indication, we will continue to put
more pressure on the CI system as we near Havana's feature freeze. Any
unneeded rechecks or reverifies can potentially slow the whole process
down for everyone. We should be running as many tests as possible
locally before pushing to Gerrit (this is as simple as running `tox`)
and making a best effort to identify the bugs that cause failures when
performing rechecks or reverifies.

[0] https://etherpad.openstack.org/havana-ci-logging
[1] http://ci.openstack.org/logstash.html

Thank you,
Clark

[openstack-dev] [OpenStack-dev] Rechecks and Reverifies