[openstack-dev] [tripleo] tripleo upstream gate outage, was: -> gate jobs impacted RAX yum mirror

Wesley Hayutin whayutin at redhat.com
Mon May 14 02:44:25 UTC 2018


On Sun, May 13, 2018 at 11:25 AM Jeremy Stanley <fungi at yuggoth.org> wrote:

> On 2018-05-13 08:25:25 -0600 (-0600), Wesley Hayutin wrote:
> [...]
> > We need to in coordination with the infra team be able to pin / lock
> > content for production check and gate jobs while also have the ability to
> > stage new content e.g. centos 7.5 with experimental or periodic jobs.
> [...]
>
> It looks like adjustments would be needed to DIB's centos-minimal
> element if we want to be able to pin it to specific minor releases.
> However, having to rotate out images in the fashion described would
> be a fair amount of manual effort and seems like it would violate
> our support expectations in governance if we end up pinning to older
> minor versions (for major LTS versions on the other hand, we expect
> to undergo this level of coordination but they come at a much slower
> pace with a lot more advance warning). If we need to add controlled
> roll-out of CentOS minor version updates, this is really no better
> than Fedora from the Infra team's perspective and we've already said
> we can't make stable branch testing guarantees for Fedora due to the
> complexity involved in using different releases for each branch and
> the need to support our stable branches longer than the distros are
> supporting the releases on which we're testing.
>

This is good insight, Jeremy. Thanks for replying.



>
> For example, how long would the distro maintainers have committed to
> supporting RHEL 7.4 after 7.5 was released? Longer than we're
> committing to extended maintenance on our stable/queens branches? Or
> would you expect projects to still continue to backport support for
> these minor platform bumps to all their stable branches too? And
> what sort of grace period should we give them before we take away
> the old versions? Also, how many minor versions of CentOS should we
> expect to end up maintaining in parallel? (Remember, every
> additional image means that much extra time to build and upload to
> all our providers, as well as that much more storage on our builders
> and in our Glance quotas.)
> --
> Jeremy Stanley
>

I think you may be describing a level of support that is far greater than
what I had in mind. I also don't want to tax the infra team w/ n+ versions
of the base OS to support.
I do think it would be helpful to have, say, a one-week change window in
which folks are given the opportunity to preflight-check a new image and
its potential impact on the job workflow.  If I could update or create a
non-voting job w/ the new image, that would provide two things:

1. A heads-up: this new minor version of CentOS is coming into the system
and you have $x days to deal with it.
2. The ability to build a few non-voting jobs w/ the new image to see what
kind of impact it has on the workflow and deployments.
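
As a rough illustration of the change window described in the two points
above, here is a minimal Python sketch. The names (CHANGE_WINDOW, job_mode,
days_left) and the dates are hypothetical, and nothing here is an existing
Zuul or nodepool feature. It only models the policy: jobs on the new image
run non-voting for a fixed window after the image becomes available, and
become voting once the window closes.

```python
from datetime import date, timedelta

# Assumption: the "one week change window" suggested in this mail.
CHANGE_WINDOW = timedelta(days=7)


def job_mode(image_available, today, window=CHANGE_WINDOW):
    """Return how jobs on the new image should run on a given day."""
    if today < image_available:
        return "not-available"   # new image has not been published yet
    if today < image_available + window:
        return "non-voting"      # preflight period: results are advisory
    return "voting"              # window closed: new image gates patches


def days_left(image_available, today, window=CHANGE_WINDOW):
    """Return the "$x days" teams still have to deal with the new image."""
    remaining = (image_available + window - today).days
    return max(remaining, 0)


if __name__ == "__main__":
    # Hypothetical dates, chosen only for illustration.
    available = date(2018, 5, 10)
    for day in (date(2018, 5, 9), date(2018, 5, 12), date(2018, 5, 17)):
        print(day, job_mode(available, day), days_left(available, day))
```

The window length is a parameter so that a different grace period could be
tried without changing the logic.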

In this case the updated CentOS 7.5 image worked fine w/ TripleO; however,
it did cause our gates to go red because:
a. when we update containers w/ zuul dependencies, all the base OS
updates were pulled in and jobs timed out.
b. a kernel bug workaround with virt-customize failed to work because the
kernel packages changed (3rd party job).
c. the containers we use were not yet at CentOS 7.5 but the bm image was,
causing issues w/ pacemaker.
d. there may be a few more that I am forgetting, but hopefully the point is
made.

We can fix a lot of the issues, and I'm not blaming anyone, because if we
(tripleo) had thought of all the corner cases with our workflow we would
have been able to avoid some of these issues.  However, it does seem like
we get hit by $something every time we update a minor version of the base
OS.  My preference would be to have a heads-up and work through the issues
rather than go immediately red and be unable to merge patches.  I don't
know if other teams get impacted in similar ways, and I understand this is
a big ship and updating CentOS may work just fine for everyone else.

Thanks all for your time and effort!



