[openstack-dev] State of the Gate - Dec 12

Anita Kuno anteaya at anteaya.info
Fri Dec 13 03:50:52 UTC 2013

On 12/12/2013 08:20 AM, Sean Dague wrote:
> Current Gate Length: 12hrs*, 41 deep
> (top of gate entered 12hrs ago)
> It's been an *exciting* week this week. For people not paying
> attention we had 2 external events which made things terrible
> earlier in the week.
> ==========================
> Event 1: sphinx 1.2 complete breakage - MOSTLY RESOLVED
> ==========================
> It turns out sphinx 1.2 + distutils (which pbr magic calls through)
> means total sadness. The fix for this was a requirements pin to
> sphinx < 1.2, and until a project has taken that they will fail in
> the gate.
> It also turns out that tox installs pre-released software by
> default (a terrible default behavior), so you also need a tox.ini
> change like this -
> https://github.com/openstack/nova/blob/master/tox.ini#L9 otherwise 
> local users will install things like sphinx 1.2b3. They will also
> break in other ways.
> Not all projects have merged this. If you are a project that
> hasn't, please don't send any other jobs to the gate until you do.
> A lot of delay was added to the gate yesterday by Glance patches
> being pushed to the gate before their doc jobs were done.
> ==========================
> Event 2: apt.puppetlabs.com outage - RESOLVED
> ==========================
> We use that apt repository to set up the devstack nodes in nodepool
> with puppet. We were triggering an issue with grenade where its
> apt-get calls were failing, because it does apt-get update once to
> make sure life is good. This only triggered in grenade (not other
> devstack runs) because we do set -o errexit aggressively.
> A fix in grenade to ignore these errors was merged yesterday
> afternoon (the purple line -
> http://status.openstack.org/elastic-recheck/ you can see where it
> showed up).
> ==========================
> Top Gate Bugs
> ==========================
> We normally do this as a list, and you can see the whole list here
> - http://status.openstack.org/elastic-recheck/ (now sorted by
> number of FAILURES in the last 2 weeks)
> That being said, our biggest race bug is currently this one bug -
> https://bugs.launchpad.net/tempest/+bug/1253896 - and if you want
> to merge patches, fixing that one bug will be huge.
We have been trying to make progress on this one all day.

Salvatore Orlando was able to dig a bit more before he had to sign off
for some sleep; see comment #25.

Brent Eagles is working on this one (thanks for the reviews, dkranz and
dims): https://review.openstack.org/#/c/59517/2
It isn't expected to fix the bug entirely, but hopefully it will reduce
its frequency.

Don Kehn is trying to work on 1253896 to see what he can see. He
hasn't looked at this one before, so he is just getting familiar with it.

I wish I had something better to report. I have to go to bed soon
myself so I thought I would share the status we have in the hopes that
those rising soon will read and carry on.

Thanks to everyone for your help addressing this,
> Basically, you can't ssh into guests that get created. That's sort
> of a fundamental property of a cloud. It shows up more frequently
> on neutron jobs, possibly due to actually testing the metadata
> server path. There have been many attempts at retry logic for this;
> we actually retry for 196 seconds to get in and only fail once we
> can't get in, so waiting isn't helping. It doesn't seem like the
> env is under that much load.
> Until we resolve this, life will not be good in landing patches.
> -Sean
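
For anyone picking this up who has not looked at the retry behaviour
Sean describes, here is a minimal sketch of a retry-until-deadline
loop. It is not the actual tempest code. The 196 second timeout comes
from the message above; the function names, the 1 second interval and
the plain TCP connect used as a stand-in for a real ssh login are all
assumptions made for illustration.

```python
import socket
import time


def retry_until_deadline(attempt, timeout=196.0, interval=1.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Call attempt() until it returns True or `timeout` seconds pass.

    Returns (succeeded, number_of_attempts). The clock and sleep
    callables are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    tries = 0
    while True:
        tries += 1
        if attempt():
            return True, tries
        # Give up only after the deadline has passed; more waiting
        # cannot help if the guest is never reachable.
        if clock() >= deadline:
            return False, tries
        sleep(interval)


def port_open(host, port=22, connect_timeout=5.0):
    """Hypothetical stand-in for an ssh check: try a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        return False
```

A caller would use it as retry_until_deadline(lambda: port_open(guest_ip)),
where guest_ip is whatever address the test was given for the guest. If
the guest never becomes reachable, the loop simply burns the whole
timeout before failing, which matches the point that waiting longer
isn't helping.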
