[openstack-dev] [TripleO][CI] Elastic recheck bugs
Clark Boylan
cboylan at sapwetik.org
Mon May 9 19:32:30 UTC 2016
On Mon, May 9, 2016, at 10:22 AM, Sagi Shnaidman wrote:
> Hi, all
>
> I'd like to enable elastic recheck on TripleO CI and have submitted
> patches for refreshing the tracked logs [1] (please review) and for the
> timeout case [2].
> But according to Derek's comment, multiple issues and bugs could be
> behind the timeout, so I'd like to clarify: what are the criteria for
> elastic recheck bugs?
>
> I thought about those markers:
>
> Nova:
> 1) "No valid host was found. There are not enough hosts"
> Network issues:
> 2) "Failed to connect to trunk.rdoproject.org" OR "fatal: The remote end
> hung up unexpectedly" OR "Could not resolve host:"
> Ironic:
> 3) "Error contacting Ironic server:"
> 4) "Introspection completed with errors:"
> 5) ": Introspection timeout"
> 6) "Timed out waiting for node "
> Glance:
> 7) "500 Internal Server Error: Failed to upload image"
> crm_resource:
> 8) "crm_resource for openstack "
>
> and various puppet errors.
>
> However, almost all of these messages could have different root causes,
> except for network failures. Easy-to-fix bugs don't make sense to submit
> there, because they will be fixed before the recheck patch is merged.
> So, could you please think about the right criteria for elastic recheck
> bugs?
I have reviewed the change to index these logs. You need to create log
files that are indexable before you can index them. Timestamps are a
huge thing here, as we use delayed processing of the log files.
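A rough way to check a log file before asking for it to be indexed is to
look for lines that do not start with a timestamp. This is only a sketch:
the timestamp shape in the regex and the helper name unindexable_lines
are assumptions for illustration, not the format the indexer requires.

```python
import re

# Assumed timestamp shape: "YYYY-MM-DD HH:MM:SS" with optional fractional
# seconds. The format the indexer really expects may differ.
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(\.\d+)?")


def unindexable_lines(lines):
    """Return 1-based numbers of non-empty lines lacking a leading timestamp."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if line.strip() and not TIMESTAMP_RE.match(line)
    ]


if __name__ == "__main__":
    sample = [
        "2016-05-09 19:32:30.123 | deploy started",
        "Timed out waiting for node",
    ]
    print(unindexable_lines(sample))
```

Any line number it reports is a line that would need a timestamp added
by the job before the file is worth indexing.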
As far as criteria for elastic recheck bugs go, you are correct: the
preference is to identify specific failures in the service logs. For
example, "500 Internal Server Error: Failed to upload image" is what the
client sees, but the server should specifically log what the error was in
the service logs. The elastic recheck bug should be based on those
service logs.
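To illustrate scoping a query to a service log rather than to the
client-visible message, here is a small sketch that builds a Lucene-style
query string. The field names "message" and "tags", the log name
"glance-api.log", the server-side message text, and the helper
build_query are all assumptions for illustration; use whatever the
indexed logs actually contain.

```python
def _quote(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_query(message, log_tag):
    """Build a query matching `message` only in files tagged `log_tag`.

    The "message" and "tags" field names are assumed, not confirmed.
    """
    return "message:" + _quote(message) + " AND tags:" + _quote(log_tag)


if __name__ == "__main__":
    # Hypothetical server-side error text and log file name.
    print(build_query("Failed to upload image data", "glance-api.log"))
```

Restricting the match to one service log keeps a single bug from
absorbing unrelated failures that merely surface the same client error.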
Clark