[openstack-dev] [tripleo] tripleo upstream gate outage, was: -> gate jobs impacted RAX yum mirror

Clark Boylan cboylan at sapwetik.org
Mon May 14 16:08:18 UTC 2018

On Mon, May 14, 2018, at 8:57 AM, Wesley Hayutin wrote:
> On Mon, May 14, 2018 at 10:36 AM Jeremy Stanley <fungi at yuggoth.org> wrote:
> > On 2018-05-14 07:07:03 -0600 (-0600), Wesley Hayutin wrote:
> > [...]
> >
> > This _doesn't_ sound to me like a problem with how we've designed
> > our infrastructure, unless there are additional details you're
> > omitting.
> So the only thing out of our control is the package set on the base
> nodepool image.
> If that suddenly gets updated with too many packages, then we have to
> scramble to ensure the images and containers are also updated.
> If there is a breaking change in the nodepool image for example [a], we
> have to react to and fix that as well.

Aren't the container images independent of the hosting platform (e.g. what infra hosts)? I'm not sure I understand why the host platform updating implies all the container images must also be updated.

> > It sounds like a problem with how the jobs are designed
> > and expectations around distros slowly trickling package updates
> > into the series without occasional larger bursts of package deltas.
> > I'd like to understand more about why you upgrade packages inside
> > your externally-produced container images at job runtime at all,
> > rather than relying on the package versions baked into them.
> We do that to ensure the gerrit review itself and its dependencies are
> built via rpm and injected into the build.
> If we did not do this the job would not be testing the change at all.
> This is a result of being a package-based deployment, for better or worse.

You'd only need to do that for the change in review, not the entire system, right?

> > Our automation doesn't know that there's a difference between
> > packages which were part of CentOS 7.4 and 7.5 any more than it
> > knows that there's a difference between Ubuntu 16.04.2 and 16.04.3.
> > Even if we somehow managed to pause our CentOS image updates
> > immediately prior to 7.5, jobs would still try to upgrade those
> > 7.4-based images to the 7.5 packages in our mirror, right?
> >
> Understood, I suspect this will become a more widespread issue as
> more projects start to use containers (not sure). It's my understanding
> that there are some mechanisms in place to pin packages in the centos
> nodepool image, so there has been some thought generally in the area of
> this issue.

Again, I think we need to understand why containers would make this worse, not better. It seems like the big feature everyone talks about when it comes to containers is isolating packaging, whether that be Python packages, so that nova and glance can use a different version of oslo, or cohabitating software that would otherwise conflict. Why do the packages on the host platform so strongly impact your container package lists?

> TripleO may be the exception to the rule here and that is fine, I'm more
> interested in exploring the possibilities of delivering updates in a
> staged fashion than anything. I don't have insight into what the
> possibilities are, or if other projects have similar issues or requests.
> Perhaps the TripleO project could share the details of our job workflow
> with the community and this would make more sense.
> I appreciate the time, effort and thoughts you have shared in the thread.
> > --
> > Jeremy Stanley
> >
> [a] https://bugs.launchpad.net/tripleo/+bug/1770298

I think answering the questions above may be the key to understanding what the underlying issue is here and how we might address it.

