[openstack-dev] Many timeouts in zuul gates for TripleO
whayutin at redhat.com
Fri Jan 19 23:45:42 UTC 2018
On Fri, Jan 19, 2018 at 12:23 PM, Ben Nemec <openstack at nemebean.com> wrote:
> On 01/18/2018 09:45 AM, Emilien Macchi wrote:
>> On Thu, Jan 18, 2018 at 6:34 AM, Or Idgar <oidgar at redhat.com> wrote:
>>> we're encountering many timeouts for zuul gates in TripleO.
>>> For example, see
>>> Rechecks don't help: a given gate sometimes finishes successfully and
>>> sometimes does not.
>>> The problem is that after a recheck it's not always the same gate that
>>> fails.
>>> Does anyone have access to the server load data to see what is causing
>>> this? Alternatively, is there something we can do to reduce the
>>> run time for each gate?
>> We're migrating to RDO Cloud for OVB jobs:
>> It's a work in progress but will help a lot for OVB timeouts on RH1.
>> I'll let the CI folks comment on that topic.
> I noticed that the timeouts on rh1 have been especially bad as of late so
> I did a little testing and found that it did seem to be running more slowly
> than it should. After some investigation I found that 6 of our compute
> nodes have warning messages that the cpu was throttled due to high
> temperature. I've disabled 4 of them that had a lot of warnings. The other
> 2 only had a handful of warnings so I'm hopeful we can leave them active
> without affecting job performance too much. It won't accomplish much if we
> disable the overheating nodes only to overload the remaining ones.
> I'll follow up with our hardware people and see if we can determine why
> these specific nodes are overheating. They seem to be running 20 degrees C
> hotter than the rest of the nodes.
For the latest discussion and to-dos before the rh1 OVB jobs are migrated to
rdo-cloud, look here.
TLDR is that we're looking for a run of seven days where the jobs are
passing at around 80% or better in check.
We've reported a number of issues w/ the environment, and AFAIK everything
has just recently been resolved.
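
To make the seven-day criterion concrete, here is a rough sketch of how it could be checked. It assumes "80% or better" means every one of the seven consecutive days meets the threshold on its own (an aggregate over the week would be a different check), and the function name and input layout are made up for illustration, not taken from any CI tooling.

```python
from datetime import date, timedelta


def has_passing_streak(daily_results, threshold=0.80, days=7):
    """Return True if `days` consecutive calendar days each meet `threshold`.

    daily_results maps a date to a (passed, total) tuple of check-job runs.
    This layout is hypothetical; adapt it to wherever the job data lives.
    """
    streak = 0
    previous = None
    for day in sorted(daily_results):
        passed, total = daily_results[day]
        # A day with no runs is treated as not meeting the threshold.
        ok = total > 0 and passed / total >= threshold
        # A missing calendar day breaks the streak.
        consecutive = previous is not None and day - previous == timedelta(days=1)
        if ok:
            streak = streak + 1 if (consecutive and streak > 0) else 1
        else:
            streak = 0
        if streak >= days:
            return True
        previous = day
    return False
```

Usage would be something like `has_passing_streak({date(2018, 1, 20): (8, 10), ...})`, with one entry per day of check results.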