[openstack-dev] [gate] job failure rate at ~ 12% (check queue) <= issue?

Markus Zoeller mzoeller at de.ibm.com
Fri Dec 18 09:23:41 UTC 2015


Sean Dague <sean at dague.net> wrote on 12/17/2015 04:48:17 PM:

> From: Sean Dague <sean at dague.net>
> To: openstack-dev at lists.openstack.org
> Date: 12/17/2015 04:48 PM
> Subject: Re: [openstack-dev] [gate] job failure rate at ~ 12% (check 
> queue) <= issue?
> 
> On 12/17/2015 05:52 AM, Markus Zoeller wrote:
> > The job failure rates had an unusual rise at 06:30 UTC this morning [1].
> > I couldn't figure out if this is a real issue or somewhat related to
> > the gerrit update ~ 18 hours ago. The only thing I found was a time
> > frame of ~ 1h where the jobs failed to update the apt repos [2]. As
> > this issue is not present anymore in logstash, I expected that the job
> > failure rate would drop, but that didn't happen. Long story short,
> > do we have an issue? Or is this the aftermath of bug 1526675? 
> > 
> > [1] http://grafana.openstack.org/dashboard/db/tempest-failure-rate
> > [2] logstash query: http://bit.ly/1O8qjtn
> > 
> > Regards, Markus Zoeller (markus_z)
> 
> That graph is a pretty narrow time slice. What's the rolling average on
> that?
> 
>    -Sean
> 
> -- 
> Sean Dague
> http://dague.net

 
If I get my math right, the averages are:

                                            <30 days  <7 days  <2 days
    ------------------------------------------------------------------
    gate-tempest-dsvm-full (check)              ~10%     ~14%      ~7%
    gate-tempest-dsvm-neutron-full (check)      ~11%     ~16%      ~9%
    gate-grenade-dsvm (check)                    ~8%     ~15%      ~7%
 
So the ~12% I observed is within the expected range, and there is no
issue. I'll take these averages into account when I interpret the
dashboard in the future, thanks.
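
For reference, here is a minimal sketch of how such windowed averages could
be computed. It is not the query behind the dashboard. It assumes a
hypothetical list of (finished_at, failed) tuples per job run, and the
function name `failure_rate` and the sample data are made up for
illustration only.

```python
from datetime import datetime, timedelta


def failure_rate(runs, now, window):
    """Return the fraction of failed runs that finished within `window` of `now`.

    `runs` is an iterable of (finished_at, failed) tuples (a hypothetical
    format). Returns None when no run falls into the window.
    """
    cutoff = now - window
    recent = [failed for finished_at, failed in runs
              if cutoff <= finished_at <= now]
    if not recent:
        return None
    # True counts as 1 and False as 0, so the sum is the number of failures.
    return sum(recent) / len(recent)


# Made-up sample data, NOT real gate results.
now = datetime(2015, 12, 18, 9, 0)
runs = [
    (now - timedelta(days=20), False),
    (now - timedelta(days=10), True),
    (now - timedelta(days=5), False),
    (now - timedelta(days=3), True),
    (now - timedelta(hours=30), False),
    (now - timedelta(hours=2), False),
]

for days in (30, 7, 2):
    rate = failure_rate(runs, now, timedelta(days=days))
    print(f"<{days} days: {rate:.0%}")
```

A short window contains few runs, so a handful of failures can swing its
rate a lot. That is why a narrow time slice is better judged against the
longer averages.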

Regards, Markus Zoeller (markus_z)



