Greetings,
Status update... Master: GREEN
Stable branches were impacted by https://bugs.launchpad.net/tripleo/+bug/1875890, which is now fixed. We are now trying to promote each branch to level out pacemaker on the node and in the containers. Queens is promoting now.
Train: GREEN
Queens: RED ( current priority ) In addition to the pacemaker issue, which has been resolved in our periodic testing jobs, we're hitting issues w/ instances timing out in tempest https://bugs.launchpad.net/tripleo/+bug/1876087
Stein: RED Also seems to have the same issue as Queens https://bugs.launchpad.net/tripleo/+bug/1876087 ( under investigation for Stein )
Rocky: RED Also seems to have the same issue as Queens https://bugs.launchpad.net/tripleo/+bug/1876087 ( under investigation for Rocky ) I will be promoting Rocky to level out pacemaker next.
To get the voting jobs in Queens back to green, I'm disabling tempest on containers-multinode. https://review.opendev.org/#/c/724703/
Additional notes may be found: https://hackmd.io/1pY-KQB_QwOe-a-5oEXTRg
-- Emilien Macchi
On Thu, Apr 30, 2020 at 6:37 PM Emilien Macchi <emilien@redhat.com> wrote:
Thanks a lot Wes + team for the hard work, every day to keep CI stable.
On Thu, Apr 30, 2020 at 7:02 PM Wesley Hayutin <whayutin@redhat.com> wrote:
Greetings,
Status update... Master: GREEN
Stable branches were impacted by https://bugs.launchpad.net/tripleo/+bug/1875890, which is now fixed. We are now trying to promote each branch to level out pacemaker on the node and in the containers. Queens is promoting now.
Train: GREEN
Queens: RED ( current priority ) In addition to the pacemaker issue, which has been resolved in our periodic testing jobs, we're hitting issues w/ instances timing out in tempest https://bugs.launchpad.net/tripleo/+bug/1876087
Stein: RED Also seems to have the same issue as Queens https://bugs.launchpad.net/tripleo/+bug/1876087 ( under investigation for Stein )
Rocky: RED Also seems to have the same issue as Queens https://bugs.launchpad.net/tripleo/+bug/1876087 ( under investigation for Rocky ) I will be promoting Rocky to level out pacemaker next.
To get the voting jobs in Queens back to green, I'm disabling tempest on containers-multinode. https://review.opendev.org/#/c/724703/
Additional notes may be found: https://hackmd.io/1pY-KQB_QwOe-a-5oEXTRg
-- Emilien Macchi
Status Update:
Master: Green, but we are seeing several jobs fail on random tempest or container start issues at the moment. No pattern yet.
Train: Green
Stein: Green
Rocky: Green
Queens: Green, but the coverage here on scenario jobs is terrible as they are all failing. There is a discrepancy between periodic and check: periodic only went red on April 26th [1], whereas check jobs started going red on March 24th [2].

Improvements in progress:
At the moment we test CentOS CR [3] in our RDO periodic pipelines, which is NOT sufficient to protect upstream jobs. It would not catch a pacemaker mismatch between the nodepool node and containers. Containers are rebuilt for each test in RDO.
To help catch issues w/ the latest CentOS packages in CR, we are making the following changes to our upstream periodic jobs. Gabriele Cerami and I are working through the design now. Input is welcome.
TLDR: Running these jobs in upstream zuul means the containers and nodes have the potential to mismatch in versions, so pacemaker issues can be caught in advance.
https://review.opendev.org/#/c/724846/
https://review.opendev.org/#/c/724858/

Thanks all

[1] https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-wednesday-weekend&job_name=periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset016-queens%09
[2] https://zuul.openstack.org/builds?job_name=tripleo-ci-centos-7-scenario001-multinode-oooq-container&branch=stable%2Fqueens
[3] https://wiki.centos.org/AdditionalResources/Repositories/CR

Thanks Emilien!
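To make the node/container mismatch concrete, here is a minimal sketch of the kind of comparison involved. It is not taken from the reviews linked above. The function name `find_version_mismatches`, the dictionary layout, and the version strings are all hypothetical and only illustrate the idea of comparing the pacemaker version on the nodepool node against the one inside a container.

```python
from typing import Dict, Iterable, List, Tuple


def find_version_mismatches(
    node_packages: Dict[str, str],
    container_packages: Dict[str, str],
    watched: Iterable[str] = ("pacemaker",),
) -> List[Tuple[str, str, str]]:
    """Return (name, node_version, container_version) for each watched
    package that is installed on both sides with different versions."""
    mismatches = []
    for name in watched:
        node_version = node_packages.get(name)
        container_version = container_packages.get(name)
        # Skip packages that are missing on either side; only a version
        # difference between two installed copies counts as a mismatch.
        if node_version is None or container_version is None:
            continue
        if node_version != container_version:
            mismatches.append((name, node_version, container_version))
    return mismatches


# Made-up version strings, for illustration only.
node = {"pacemaker": "1.1.21-4.el7", "corosync": "2.4.5-4.el7"}
container = {"pacemaker": "1.1.20-5.el7", "corosync": "2.4.5-4.el7"}

print(find_version_mismatches(node, container))
```

In a real job the two dictionaries would have to be filled from the package lists of the node and of the container image (for example from `rpm -qa` output on each side). How that is collected is left out here, since it depends on the job setup.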