[openstack-dev] [all] Update on Zuul v3 Migration - and what to do about issues

Monty Taylor mordred at inaugust.com
Fri Sep 29 14:58:57 UTC 2017

Hey everybody!

tl;dr - If you're having issues with your jobs, check the FAQ, this
email, and the follow-ups on this thread for mentions of them. If it's
an issue with your job and you can spot it (bad config), just submit a
patch with topic 'zuulv3'. If it's bigger/weirder/you don't know, we'd
like to ask that you send a follow-up email to this thread so that we
can ensure we've got them all and so that others can see it too.

** Zuul v3 Migration Status **

If you haven't noticed the Zuul v3 migration - awesome, that means it's 
working perfectly for you.

If you have - sorry for the disruption. It turns out we have a REALLY
complicated array of job content you've all created. Hopefully the pain
of the moment will be offset by the ability for you all to take direct
ownership of your awesome content... so bear with us, your patience is
appreciated.

If you find yourself with some extra time on your hands while you wait 
on something, you may find it helpful to read:


We're adding content to it as issues arise. Unfortunately, one of the 
issues is that the infra manual publication job stopped working.

While the infra manual publication is being fixed, we're collecting FAQ 
content for it in an etherpad:


If you have a job issue, check it first to see if we've got an entry for 
it. Once manual publication is fixed, we'll update the etherpad to point 
to the FAQ section of the manual.

** Global Issues **

There are a number of outstanding issues that are being worked on. As
of right now, there are a few major/systemic ones that we're looking
into that are worth noting:

* Zuul Stalls

If you say to yourself "zuul doesn't seem to be doing anything, did I do
something wrong?", we're seeing intermittent connection issues in the
backend plumbing that jeblair and Shrews are currently tracking down.

When it happens it's an across-the-board issue, so fixing it is our
number one priority.

* Incorrect node type

We've got reports of things running on trusty that should be running on
xenial. The job definitions look correct, so this is also under
investigation.

* Multinode jobs having POST FAILURE

There is a bug in log collection: it tries to collect from all nodes,
while the old jobs were designed to collect only from the 'primary'.
Patches are up to fix this, and it should be resolved soon.

* Branch Exclusions being ignored

This has been reported and its cause is currently unknown.

Thank you all again for your patience! This is a giant rollout with a 
bunch of changes in it, so we really do appreciate everyone's 
understanding as we work through it all.

