[tripleo] CI

Wesley Hayutin whayutin at redhat.com
Fri May 1 15:00:14 UTC 2020


On Thu, Apr 30, 2020 at 6:37 PM Emilien Macchi <emilien at redhat.com> wrote:

> Thanks a lot Wes + team for the hard work, every day to keep CI stable.
>
> On Thu, Apr 30, 2020 at 7:02 PM Wesley Hayutin <whayutin at redhat.com>
> wrote:
>
>> Greetings,
>>
>> Status update...
>> Master: GREEN
>>
>> Stable Branches impacted by:
>> https://bugs.launchpad.net/tripleo/+bug/1875890 fixed
>> Now we are trying to promote each branch to level out pacemaker on the
>> node and containers.  Queens is promoting now.
>>
>> Train: GREEN
>>
>> Queens:  RED ( current priority )
>> In addition to the pacemaker issue, which has been resolved in our
>> periodic testing jobs, we're hitting issues w/ instances timing out in
>> tempest
>> https://bugs.launchpad.net/tripleo/+bug/1876087
>>
>> Stein: RED
>> Also seems to have the same issue as Queens
>> https://bugs.launchpad.net/tripleo/+bug/1876087 ( under investigation
>> for Stein )
>>
>> Rocky: RED
>> Also seems to have the same issue as Queens
>> https://bugs.launchpad.net/tripleo/+bug/1876087 ( under investigation
>> for Rocky )
>> I will be promoting Rocky to level out pacemaker next.
>>
>> In order to get the voting jobs in queens back to green, I'm disabling
>> tempest on containers-multinode. https://review.opendev.org/#/c/724703/
>>
>> Additional notes may be found:
>> https://hackmd.io/1pY-KQB_QwOe-a-5oEXTRg
>>
>>
>
> --
> Emilien Macchi
>

Status Update:

Master: Green, but several jobs are failing intermittently on tempest or
container start issues at the moment. No pattern has emerged yet.

Train: Green

Stein: Green

Rocky: Green

Queens: Green, but the coverage from scenario jobs is poor because they are
all failing.  There is a discrepancy between the periodic and check
pipelines: periodic only went red on April 26th [1], whereas check jobs
started going red on March 24th [2].

Improvements in progress:
At the moment we test CentOS CR [3] in our RDO periodic pipelines, which is
NOT sufficient to protect upstream jobs. It would not catch a pacemaker
mismatch between the nodepool node and the containers, because containers
are rebuilt for each test in RDO.

To help catch issues w/ the latest CentOS packages in CR, we are making the
following changes to our upstream periodic jobs.  Gabriele Cerami and I are
working through the design now.  Input is welcome.

TLDR: Using the upstream zuul means the containers and nodes have the
potential to mismatch in versions, so these jobs can catch pacemaker issues
in advance.
https://review.opendev.org/#/c/724846/
https://review.opendev.org/#/c/724858/
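
To illustrate the kind of mismatch we want to catch, here is a minimal
sketch (not what the reviews above implement). It assumes we already have
"name version-release" package listings for the nodepool node and for a
container; the function names, the sample listings and the version strings
are all hypothetical and only there for illustration.

```python
# Hypothetical sample listings, one "name version-release" pair per line.
# The versions are made up for illustration, not taken from the bug.
NODE_LISTING = "pacemaker 1.1.21-4.el7\ncorosync 2.4.5-4.el7\n"
CONTAINER_LISTING = "pacemaker 1.1.20-5.el7\ncorosync 2.4.5-4.el7\n"


def parse_package_listing(text):
    """Turn 'name version-release' lines into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            name, version = parts
            packages[name] = version
    return packages


def find_mismatches(node_pkgs, container_pkgs, watched=("pacemaker",)):
    """Return {name: (node_version, container_version)} for watched
    packages that are present on both sides with different versions."""
    mismatches = {}
    for name in watched:
        node_version = node_pkgs.get(name)
        container_version = container_pkgs.get(name)
        if node_version is None or container_version is None:
            continue  # package missing on one side, nothing to compare
        if node_version != container_version:
            mismatches[name] = (node_version, container_version)
    return mismatches


if __name__ == "__main__":
    node = parse_package_listing(NODE_LISTING)
    container = parse_package_listing(CONTAINER_LISTING)
    for name, (node_v, cont_v) in find_mismatches(node, container).items():
        print(f"{name}: node has {node_v}, container has {cont_v}")
```

The comparison is a plain string inequality on purpose: for this check any
difference between node and container is worth flagging, so no version
ordering is needed.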

Thanks all

[1]
https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-wednesday-weekend&job_name=periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset016-queens
[2]
https://zuul.openstack.org/builds?job_name=tripleo-ci-centos-7-scenario001-multinode-oooq-container&branch=stable%2Fqueens
[3] https://wiki.centos.org/AdditionalResources/Repositories/CR



Thanks Emilien!