[openstack-dev] Many timeouts in zuul gates for TripleO

Paul Belanger pabelanger at redhat.com
Sat Jan 20 02:38:47 UTC 2018


On Fri, Jan 19, 2018 at 11:23:45AM -0600, Ben Nemec wrote:
> 
> 
> On 01/18/2018 09:45 AM, Emilien Macchi wrote:
> > On Thu, Jan 18, 2018 at 6:34 AM, Or Idgar <oidgar at redhat.com> wrote:
> > > Hi,
> > > we're encountering many timeouts for zuul gates in TripleO.
> > > For example, see
> > > http://logs.openstack.org/95/508195/28/check-tripleo/tripleo-ci-centos-7-ovb-ha-oooq/c85fcb7/.
> > > 
> > > Rechecks don't help; sometimes a given gate job finishes successfully
> > > and sometimes it doesn't.
> > > The problem is that after a recheck it's not always the same gate job
> > > that fails.
> > > 
> > > Does someone have access to the servers' load to see what is causing
> > > this? Alternatively, is there something we can do to reduce the running
> > > time of each gate job?
> > 
> > We're migrating to RDO Cloud for OVB jobs:
> > https://review.openstack.org/#/c/526481/
> > It's a work in progress but will help a lot for OVB timeouts on RH1.
> > 
> > I'll let the CI folks comment on that topic.
> > 
> 
> I noticed that the timeouts on rh1 have been especially bad lately, so I
> did a little testing and found that the cloud did seem to be running more
> slowly than it should.  After some investigation I found that 6 of our
> compute nodes have warning messages that the CPU was throttled due to high
> temperature.  I've disabled 4 of them that had a lot of warnings.  The
> other 2 only had a handful of warnings, so I'm hopeful we can leave them
> active without affecting job performance too much.  It won't accomplish
> much if we disable the overheating nodes only to overload the remaining
> ones.
> 
> I'll follow up with our hardware people and see if we can determine why
> these specific nodes are overheating.  They seem to be running 20 degrees C
> hotter than the rest of the nodes.
> 
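
For anyone who wants to repeat that kind of check elsewhere, a rough sketch
of how one could count the throttle warnings per compute node (the hostnames
and threshold below are made up, and it assumes passwordless SSH plus the
usual "clock throttled" dmesg wording):

#!/usr/bin/env python
# Rough sketch, not what was actually run on rh1: count thermal-throttle
# warnings per compute node.  Hostnames and threshold are hypothetical.
import subprocess

COMPUTE_NODES = ["compute-%02d" % i for i in range(1, 13)]  # made-up names


def throttle_warnings(host):
    """Return how many thermal-throttle lines are in dmesg on *host*."""
    out = subprocess.check_output(
        ["ssh", host, "dmesg | grep -ci 'clock throttled' || true"])
    return int(out.strip() or 0)


if __name__ == "__main__":
    for node in COMPUTE_NODES:
        count = throttle_warnings(node)
        flag = "  <-- candidate to disable" if count > 100 else ""
        print("%s: %d throttle warnings%s" % (node, count, flag))
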
Did tripleo-test-cloud-rh1 get the new kernels for Meltdown / Spectre
applied?  It's possible that is impacting performance too.
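
A quick way to tell on a given node is to look at the kernel version and
the vulnerability files under sysfs; a minimal sketch, assuming the running
kernel is new enough to expose them:

#!/usr/bin/env python
# Minimal sketch: print the running kernel and the Meltdown/Spectre
# mitigation status.  Older kernels won't have the sysfs directory at
# all, which is itself an answer.
import os
import platform

VULN_DIR = "/sys/devices/system/cpu/vulnerabilities"

print("kernel: %s" % platform.release())
if os.path.isdir(VULN_DIR):
    for name in sorted(os.listdir(VULN_DIR)):
        with open(os.path.join(VULN_DIR, name)) as f:
            print("%s: %s" % (name, f.read().strip()))
else:
    print("no vulnerabilities dir; kernel predates the Meltdown/Spectre fixes")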

-Paul


