[dev][infra][tact-sig] Zuul 4.6.0 and associated job changes
Today at 14:00 UTC the OpenDev Collaboratory upgraded its deployment of Zuul, coinciding with the 4.6.0 security release. This release disabled or changed a number of features which could previously be leveraged to take over executors or obtain decrypted copies of secret data, necessitating adjustments to some jobs.
I think we've now addressed the majority of the central job resources which were impacted, but there are almost certainly less-frequently-exercised jobs which are still configured to do things which will no longer work. There were likely some strange-looking failures, particularly in promote and post pipeline builds, between 15:00 and 19:00 UTC today, so if you need something rerun for any reason please do reach out.
The two main categories of new bugs which will need fixing are:
* Use of Jinja2 templating in secret definitions
* Setting ansible_connection, ansible_host, ansible_python_interpreter, ansible_shell_executable, or ansible_user
The full release notes can be found in the release announcement here:
http://lists.zuul-ci.org/pipermail/zuul-announce/2021-June/000096.html
If you run into a new problem in one of your jobs and you believe it may be related to the above or similar fallout from the changes in Zuul 4.6.0 and need assistance, please don't hesitate to contact the TaCT SIG in the #openstack-infra channel on the OFTC IRC network or by replying to this mailing list thread.
Apologies for any disruption this update may have caused, and thanks for your understanding.
-- Jeremy Stanley
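For anyone wanting to check a repository for the second category before Zuul reports it, here is a minimal sketch. It is a plain text-level heuristic over YAML content, not Zuul's own validation, and the function name and sample job are hypothetical. It only looks for the five variable names listed in the announcement when they appear as mapping keys.

```python
import re

# The five connection variables named in the announcement.
RESTRICTED_VARS = (
    "ansible_connection",
    "ansible_host",
    "ansible_python_interpreter",
    "ansible_shell_executable",
    "ansible_user",
)

# Matches a YAML mapping key such as "  ansible_user: zuul".
_KEY_RE = re.compile(r"^\s*(?:-\s+)?(" + "|".join(RESTRICTED_VARS) + r")\s*:")


def find_restricted_vars(text):
    """Return (line_number, variable_name) pairs for each suspicious key."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = _KEY_RE.match(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


# Hypothetical job definition used only to exercise the function.
sample = """\
- job:
    name: example-job
    vars:
      ansible_python_interpreter: /usr/bin/python3
      some_other_var: true
"""
print(find_restricted_vars(sample))
```

Because this works on raw text, it will miss variables set through other means and may flag keys that Zuul would accept, so treat any hit as a pointer for manual review.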
For the teams tagged in the subject, please have a look at https://zuul.opendev.org/t/openstack/config-errors and merge fixes to your respective repositories for the errors listed there. A summary view can also be found by clicking the "bell" icon in the top-right corner of https://zuul.opendev.org/t/openstack/status (or similar pages).
Many of these errors are new as of yesterday, due to lingering ansible_python_interpreter variable assignments left over from the Python 3.x default transition. Zuul no longer allows overriding the value of this variable, but the assignments can be safely removed since all cases seem to be setting it to the same value as our current default.
Roughly half the errors look like they've been there for longer, and seem to relate to project renames or job removals leaving stale references in other projects. In most cases you should simply be able to update the project names in these or remove the associated jobs as they're likely no longer used. Also be aware that many of these errors are on stable branches, so the cleanup will need backporting in such cases.
Thanks for your prompt attention!
-- Jeremy Stanley
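Since many of the errors sit on stable branches, it can help to see which branches of a project need a backport. The following is a rough sketch that assumes the error list has already been loaded into Python as a list of dictionaries, each carrying a source_context mapping with project and branch keys. Those field names, the function name, and the sample entries are assumptions for illustration, not a documented schema.

```python
from collections import Counter


def count_errors_by_branch(errors):
    """Count configuration errors per (project, branch) pair."""
    counts = Counter()
    for entry in errors:
        context = entry.get("source_context") or {}
        key = (context.get("project", "unknown"), context.get("branch", "unknown"))
        counts[key] += 1
    return counts


# Hypothetical sample entries; field names are assumed, not a documented schema.
sample_errors = [
    {"source_context": {"project": "openstack/example-a", "branch": "stable/pike"},
     "error": "Unknown projects: openstack/old-name"},
    {"source_context": {"project": "openstack/example-a", "branch": "stable/pike"},
     "error": "Job example-job not defined"},
    {"source_context": {"project": "openstack/example-b", "branch": "master"},
     "error": "Variable name not allowed"},
]

for (project, branch), total in sorted(count_errors_by_branch(sample_errors).items()):
    print(f"{project} {branch}: {total}")
```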
On Fri, Jun 25, 2021 at 10:19 AM Jeremy Stanley <fungi@yuggoth.org> wrote:
For the teams tagged in the subject, please have a look at https://zuul.opendev.org/t/openstack/config-errors and merge fixes to your respective repositories for the errors listed there. A summary view can also be found by clicking the "bell" icon in the top-right corner of https://zuul.opendev.org/t/openstack/status (or similar pages).
It looks like the puppet-openstack-integration stable/ocata and stable/pike branches need to be cleaned up or removed. I don't see the repo listed among the deliverables in the releases repo, so these branches might have been manually created before moving under the release umbrella. I believe we've EOL'd pike and ocata for the regular modules. What would be the best course of action to clean up these branches?
Thanks, -Alex
On Fri, Jun 25, 2021, at 9:39 AM, Alex Schultz wrote:
It looks like the puppet-openstack-integration stable/ocata and stable/pike branches need to be cleaned up or removed. I don't see the repo listed among the deliverables in the releases repo, so these branches might have been manually created before moving under the release umbrella. I believe we've EOL'd pike and ocata for the regular modules. What would be the best course of action to clean up these branches?
For OpenStack release-managed projects (I believe this is one), the OpenStack release team has the appropriate permissions in Gerrit, as well as scripted tools, to EOL branches properly. I think you can make a request to them and they can run through that for you. For projects that are not managed by the OpenStack release team, we can help you update the Gerrit ACLs so that you have appropriate permissions for this type of cleanup. https://opendev.org/openstack/project-config/src/branch/master/gerrit/acls/o... shows the set of permissions needed to abandon all open changes on a branch, tag the branch with an eol tag, then remove the branch. (The create permission isn't strictly necessary here.)
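The three cleanup steps described above (abandon open changes, tag, delete the branch) can be sketched as a plan of Gerrit REST calls. This is only an illustration and not the release team's tooling: it builds the method and path for each step without sending anything, the function name is hypothetical, the "<series>-eol" tag name is an assumption, and the open change numbers would have to be looked up separately.

```python
from urllib.parse import quote


def plan_branch_eol(project, branch, open_change_ids):
    """Return (method, path) pairs for the three cleanup steps, in order."""
    # Gerrit expects project and branch names to be URL-encoded in paths.
    encoded_project = quote(project, safe="")
    encoded_branch = quote(branch, safe="")
    # Assumed tag convention: "stable/pike" is tagged as "pike-eol".
    series = branch.rsplit("/", 1)[-1]
    encoded_tag = quote(f"{series}-eol", safe="")

    # Step 1: abandon every open change on the branch.
    steps = [("POST", f"/changes/{change_id}/abandon") for change_id in open_change_ids]
    # Step 2: create the eol tag (the request body would name the revision).
    steps.append(("PUT", f"/projects/{encoded_project}/tags/{encoded_tag}"))
    # Step 3: delete the branch itself.
    steps.append(("DELETE", f"/projects/{encoded_project}/branches/{encoded_branch}"))
    return steps


# Hypothetical change number, used only for illustration.
for method, path in plan_branch_eol(
    "openstack/puppet-openstack-integration", "stable/pike", [12345]
):
    print(method, path)
```

Keeping the plan separate from execution makes it easy to review the exact calls before anyone with the right ACLs runs them.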
On 2021-06-25 10:39:25 -0600 (-0600), Alex Schultz wrote: [...]
It looks like the puppet-openstack-integration stable/ocata and stable/pike branches need to be cleaned up or removed. I don't see the repo listed among the deliverables in the releases repo, so these branches might have been manually created before moving under the release umbrella. I believe we've EOL'd pike and ocata for the regular modules. What would be the best course of action to clean up these branches? [...]
The OpenStack Release Managers have branch deletion access via the Gerrit WebUI and REST API, and have been performing scripted batch deletions of EOL branches for a little while now. These may already be slated for removal, but it can't hurt to confirm. -- Jeremy Stanley
The EOL tags for stable/ocata and stable/pike for puppet-openstack were created a while ago [1], and these two branches are no longer maintained. Because the EOL tags already exist, we can remove the stable/ocata and stable/pike branches from the Git repository and Gerrit. I'll ask the Release Management team to delete these two branches, after which I expect the current errors to be resolved. (Sorry, it seems I forgot to request the deletion when I proposed the EOL.)
[1] https://review.opendev.org/c/openstack/releases/+/726392/
On Sat, Jun 26, 2021 at 1:57 AM Jeremy Stanley <fungi@yuggoth.org> wrote:
The OpenStack Release Managers have branch deletion access via the Gerrit WebUI and REST API, and have been performing scripted batch deletions of EOL branches for a little while now. These may already be slated for removal, but it can't hurt to confirm. -- Jeremy Stanley
Regarding the error in the puppet repos, it turned out that one repo (puppet-openstack-integration) has no deliverables for Pike and Ocata, and because of that its stable/ocata and stable/pike branches have not yet been EOLed. (These two stable branches were EOLed in the other puppet repos.) I've raised this in the #openstack-release channel and will ask the release team for help moving these two branches in the p-o-i repo to EOL.
On Sat, Jun 26, 2021 at 2:28 PM Takashi Kajinami <tkajinam@redhat.com> wrote:
The EOL tags for stable/ocata and stable/pike for puppet-openstack were created a while ago [1], and these two branches are no longer maintained. Because the EOL tags already exist, we can remove the stable/ocata and stable/pike branches from the Git repository and Gerrit. I'll ask the Release Management team to delete these two branches, after which I expect the current errors to be resolved. (Sorry, it seems I forgot to request the deletion when I proposed the EOL.)
[1] https://review.opendev.org/c/openstack/releases/+/726392/
Regarding the neutron-related errors, I see two patterns.
The first is "Unknown projects: openstack/networking-l2gw". This is caused by the official retirement of networking-l2gw, after which openstack/networking-l2gw was dropped from zuul/main.yaml. We need to replace openstack/networking-l2gw with x/networking-l2gw. This happens in networking-odl, networking-midonet and neutron-fwaas.
The second is "Job neutron-fwaas-networking-midonet-cross-py35 not defined". This happens in neutron-fwaas.
I will take care of them.
Thanks, Akihiro Motoki (amotoki)
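The openstack/networking-l2gw to x/networking-l2gw replacement mentioned in this part of the thread could be applied to a configuration file with something like the sketch below. The function name is hypothetical and the sample input is made up. The pattern deliberately leaves longer names that merely start with the old project name untouched.

```python
import re

# Match the old project name only when it is not part of a longer name.
_OLD_NAME = re.compile(r"(?<![\w-])openstack/networking-l2gw(?![\w-])")


def rename_l2gw(text):
    """Replace references to the retired project with its new namespace."""
    return _OLD_NAME.sub("x/networking-l2gw", text)


# Hypothetical configuration snippet used only to exercise the function.
before = "required-projects:\n  - openstack/networking-l2gw\n"
print(rename_l2gw(before))
```

The same change would still need to be proposed to each affected branch, since a text substitution does nothing about backports.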
Hi,
On Monday, 28 June 2021 13:41:44 CEST Akihiro Motoki wrote:
Regarding neutron related errors, I see two patterns.
The one is "Unknown projects: openstack/networking-l2gw". This is caused by the official retirement of networking-l2gw and openstack/networking-l2gw was dropped from zuul/main.yaml. We need to replace openstack/networking-l2gw with x/networking-l2gw. This happens in networking-odl, networking-midonet and neutron-fwaas.
The other is "Job neutron-fwaas-networking-midonet-cross-py35 not defined". This happens in neutron-fwaas.
I will take care of them.
Thanks a lot, Akihiro, for taking care of it. Please ping me when you have something to review :)
-- Slawek Kaplonski Principal Software Engineer Red Hat
Regarding Barbican, I have submitted a series of patches to remove references to the octavia-v1-dsvm-scenario job:
https://review.opendev.org/q/topic:%22octavia-v1%22+(status:open%20OR%20stat...)
It seems the job no longer exists since Octavia stable/stein was EOLed.
participants (6)
- Akihiro Motoki
- Alex Schultz
- Clark Boylan
- Jeremy Stanley
- Slawek Kaplonski
- Takashi Kajinami