[openstack-dev] [tripleo] tripleo upstream gate outtage, was: -> gate jobs impacted RAX yum mirror
cboylan at sapwetik.org
Mon May 14 18:05:29 UTC 2018
On Mon, May 14, 2018, at 10:11 AM, Wesley Hayutin wrote:
> On Mon, May 14, 2018 at 12:08 PM Clark Boylan <cboylan at sapwetik.org> wrote:
> > On Mon, May 14, 2018, at 8:57 AM, Wesley Hayutin wrote:
> > > On Mon, May 14, 2018 at 10:36 AM Jeremy Stanley <fungi at yuggoth.org> wrote:
> > >
> > > > On 2018-05-14 07:07:03 -0600 (-0600), Wesley Hayutin wrote:
> > > > Our automation doesn't know that there's a difference between
> > > > packages which were part of CentOS 7.4 and 7.5 any more than it
> > > > knows that there's a difference between Ubuntu 16.04.2 and 16.04.3.
> > > > Even if we somehow managed to pause our CentOS image updates
> > > > immediately prior to 7.5, jobs would still try to upgrade those
> > > > 7.4-based images to the 7.5 packages in our mirror, right?
> > > >
> > >
> > > Understood, I suspect this will become a more widespread issue as
> > > more projects start to use containers (not sure). It's my
> > > understanding that there are some mechanisms in place to pin packages
> > > in the centos nodepool image, so there have been some thoughts
> > > generally in the area of this issue.
> > Again, I think we need to understand why containers would make this worse
> > not better. Seems like the big feature everyone talks about when it comes
> > to containers is isolating packaging whether that be python packages so
> > that nova and glance can use a different version of oslo or cohabitating
> > software that would otherwise conflict. Why do the packages on the host
> > platform so strongly impact your container package lists?
> I'll let others comment on that, however my thought is you don't move from
> A -> Z in one step and containers do not make everything easier
> immediately. Like most things, it takes a little time.
If the main issue is being caught in a transition period at the same time a minor update happens, can we treat this as a temporary state? Rather than attempting to solve this particular case happening again in the future, we might be better served testing that upcoming CentOS releases won't break tripleo due to packaging changes, using the centos-release-cr repo as Tristan suggests. That should tell you if something like pacemaker were to stop working. Note this wouldn't require any infra-side updates; you would just have these jobs configure the additional repo and go from there.
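As a rough sketch of what "configure the additional repo" could look like in a job's setup step: the helper name enable_cr_repo below is hypothetical, and this assumes a CentOS 7 node with the extras repo enabled (which is where the centos-release-cr package is published). The real jobs would more likely do this in their existing setup tooling; this only illustrates the two steps involved.

```python
import subprocess

# Assumption: CentOS 7 node with the "extras" repo enabled, so that the
# centos-release-cr package is installable.
CR_COMMANDS = [
    ["yum", "-y", "install", "centos-release-cr"],  # enables the CR repo
    ["yum", "-y", "update"],                        # pulls in the pre-release packages
]


def enable_cr_repo(run=subprocess.check_call, use_sudo=True):
    """Run each command in order and return the commands that were executed.

    `run` is injectable so the sequence can be inspected without touching
    the system; by default a failing command raises CalledProcessError.
    """
    executed = []
    for cmd in CR_COMMANDS:
        full = ["sudo"] + cmd if use_sudo else list(cmd)
        run(full)
        executed.append(full)
    return executed
```

A job would call enable_cr_repo() before running its normal deployment steps, so any breakage from the upcoming minor release shows up in that job rather than in the gate.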
Then on top of that, get through the transition period so that the containers isolate you from these changes in the way they should. Then when 7.6 happens you'll hopefully have identified all the broken packaging ahead of time and worked with upstream to address those problems (which should be important for a stable long-term support distro), and your containers can update at whatever pace they choose.
I don't think it would be appropriate for Infra to stage CentOS minor versions, for a couple of reasons. The first is that we don't support specific minor versions of CentOS/RHEL; we support the major version, and if it updates and OpenStack stops working, that is CI doing its job and providing that info. The other major concern is that CentOS specifically says "We are trying to make sure people understand they can NOT use older minor versions and still be secure." Similarly to how we won't support Ubuntu 12.04 because it is no longer supported, we shouldn't support CentOS 7.4 at this point. These are no longer secure platforms.
However, I think testing using the pre-release repo as proposed above should allow you to catch issues before updates happen just as well as a staged minor version update would. The added benefit of this process is that you should know as soon as possible, and not after the release has been made (helping other users of CentOS by not releasing broken packages in the first place).