[openstack-dev] [all] Update on Zuul v3 Migration - and what to do about issues
Monty Taylor
mordred at inaugust.com
Fri Sep 29 14:58:57 UTC 2017
Hey everybody!
tl;dr - If you're having issues with your jobs, check the FAQ, this
email and followups on this thread for mentions of them. If it's an
issue with your job and you can spot it (bad config) just submit a patch
with topic 'zuulv3'. If it's bigger/weirder/you don't know - we'd like
to ask that you send a follow-up email to this thread so that we can
make sure we've captured every issue and so that others can see it too.
** Zuul v3 Migration Status **
If you haven't noticed the Zuul v3 migration - awesome, that means it's
working perfectly for you.
If you have - sorry for the disruption. It turns out we have a REALLY
complicated array of job content you've all created. Hopefully the pain
of the moment will be offset by the ability for you to all take direct
ownership of your awesome content... so bear with us, your patience is
appreciated.
If you find yourself with some extra time on your hands while you wait
on something, you may find it helpful to read:
https://docs.openstack.org/infra/manual/zuulv3.html
We're adding content to it as issues arise. Unfortunately, one of the
issues is that the infra manual publication job stopped working.
While the infra manual publication is being fixed, we're collecting FAQ
content for it in an etherpad:
https://etherpad.openstack.org/p/zuulv3-migration-faq
If you have a job issue, check it first to see if we've got an entry for
it. Once manual publication is fixed, we'll update the etherpad to point
to the FAQ section of the manual.
** Global Issues **
There are a number of outstanding issues that are being worked on. As of
right now, there are a few major/systemic ones that we're looking into
that are worth noting:
* Zuul Stalls
If you find yourself thinking "zuul doesn't seem to be doing anything,
did I do something wrong?", it is most likely this issue: jeblair and
Shrews are currently tracking down intermittent connection issues in the
backend plumbing.
When it happens it's an across-the-board issue, so fixing it is our
number one priority.
* Incorrect node type
We've got reports of things running on trusty that should be running on
xenial. The job definitions look correct, so this is also under
investigation.
* Multinode jobs having POST_FAILURE
There is a bug in log collection: it tries to collect from all nodes,
while the old jobs were designed to collect only from the 'primary'.
Patches are up to fix this, and it should be resolved soon.
* Branch Exclusions being ignored
This has been reported and its cause is currently unknown.
Thank you all again for your patience! This is a giant rollout with a
bunch of changes in it, so we really do appreciate everyone's
understanding as we work through it all.
Monty