[openstack-dev] State of the Gate - Dec 12

Anita Kuno anteaya at anteaya.info
Thu Dec 12 14:39:10 UTC 2013


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/12/2013 08:20 AM, Sean Dague wrote:
> Current Gate Length: 12hrs*, 41 deep
> 
> (top of gate entered 12hrs ago)
> 
> It's been an *exciting* week this week. For people not paying
> attention we had 2 external events which made things terrible
> earlier in the week.
> 
> ==========================
> Event 1: sphinx 1.2 complete breakage - MOSTLY RESOLVED
> ==========================
> 
> It turns out sphinx 1.2 + distutils (which the pbr magic calls
> through) means total sadness. The fix for this was a requirements
> pin to sphinx < 1.2, and until a project has taken that pin it will
> fail in the gate.
> 
> It also turns out that tox installs pre-release software by
> default (a terrible default behavior), so you also need a tox.ini
> change like this -
> https://github.com/openstack/nova/blob/master/tox.ini#L9
> Otherwise local users will install things like sphinx 1.2b3. They
> will also break in other ways.
> 
> Not all projects have merged this. If you are a project that
> hasn't, please don't send any other jobs to the gate until you do.
> A lot of delay was added to the gate yesterday by Glance patches
> being pushed to the gate before their doc jobs were done.
> 
> ==========================
> Event 2: apt.puppetlabs.com outage - RESOLVED
> ==========================
> 
> We use that apt repository to set up the devstack nodes in nodepool
> with puppet. We were triggering an issue with grenade where its
> apt-get calls were failing, because it does apt-get update once to
> make sure life is good. This only triggered in grenade (not other
> devstack runs) because we do set -o errexit aggressively.
> 
> A fix in grenade to ignore these errors was merged yesterday
> afternoon (the purple line -
> http://status.openstack.org/elastic-recheck/ you can see where it
> showed up).
> 
> ==========================
> Top Gate Bugs
> ==========================
> 
> We normally do this as a list, and you can see the whole list here
> - http://status.openstack.org/elastic-recheck/ (now sorted by
> number of FAILURES in the last 2 weeks)
> 
> That being said, our biggest race bug is currently this one -
> https://bugs.launchpad.net/tempest/+bug/1253896 - and if you want
> to merge patches, fixing that one bug will be huge.
> 
> Basically, you can't ssh into guests that get created. That's sort
> of a fundamental property of a cloud. It shows up more frequently
> on neutron jobs, possibly due to actually testing the metadata
> server path. There have been many attempts at retry logic for this;
> we actually retry for 196 seconds to get in and only fail once we
> still can't get in, so waiting isn't helping. It doesn't seem like
> the env is under that much load.
> 
> Until we resolve this, life will not be good in landing patches.
> 
> -Sean
> 
> 
Thanks Sean:

This is a terrific summary which really makes my task of confirming
and following up much more manageable.

By way of preempting the "it's neutron's fault" pile-on, in case
anyone is tempted, a few facts:

We were paying attention, as it happens, to the sphinx pin. Patches to
neutron and neutronclient have merged:
http://git.openstack.org/cgit/openstack/neutron/tree/test-requirements.txt#n9
http://git.openstack.org/cgit/openstack/python-neutronclient/tree/test-requirements.txt#n9

The addition of the -U flag for pip install in tox.ini for
neutronclient (https://review.openstack.org/#/c/60825/4) is in the
check queue; it tripped on the sphinx pin for neutronclient. Here is
the one for neutron: https://review.openstack.org/#/c/60825/4
I've just alerted all neutron-core to refrain from +A'ing until these
are merged.
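
For anyone wondering why both the pin and the tox.ini change are
needed, here is a rough sketch of the interaction Sean describes. It
is not pip's or tox's actual resolution code; the helper names, the
simplified version grammar, and the candidate list are all assumptions
made up for illustration. The point is only that a beta such as 1.2b3
sorts below 1.2, so a "< 1.2" pin does not exclude it once
pre-releases are allowed.

```python
import re

# Simplified version grammar (an assumption for this sketch): dotted
# integers with an optional a/b/rc suffix, e.g. "1.1.3" or "1.2b3".
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?$")


def parse(version):
    """Split a version string into (release tuple, pre-release or None)."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)
    release = tuple(int(part) for part in match.group(1).split("."))
    # Pad to four components so that "1.2" and "1.2.0" compare equal.
    release = release + (0,) * (4 - len(release))
    if match.group(2) is None:
        return release, None
    return release, (match.group(2), int(match.group(3)))


def sort_key(version):
    """Order versions so a pre-release sorts before its final release."""
    release, pre = parse(version)
    if pre is None:
        return (release, 1, ("", 0))
    return (release, 0, pre)


def best_candidate(candidates, upper_bound, allow_pre):
    """Return the highest candidate strictly below upper_bound, or None."""
    allowed = []
    for version in candidates:
        is_pre = parse(version)[1] is not None
        if is_pre and not allow_pre:
            continue
        if sort_key(version) < sort_key(upper_bound):
            allowed.append(version)
    return max(allowed, key=sort_key) if allowed else None


if __name__ == "__main__":
    # Hypothetical set of versions available on the package index.
    available = ["1.1.3", "1.2b3", "1.2"]
    print(best_candidate(available, "1.2", allow_pre=True))
    print(best_candidate(available, "1.2", allow_pre=False))
```

With pre-releases allowed the sketch picks the beta despite the pin;
with them disallowed it falls back to the last final release below 1.2.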

We had been tracking 1253896 quite closely and I at least was working
with the belief we had done the work we needed to do for that bug.
Since it now comes to light that I am in error with regard to
neutron's responsibility for 1253896, I welcome all interested parties
to #openstack-neutron so that we can again work together to submit a
patch that addresses this issue.

Thanks Sean,
Anita.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSqcpnAAoJELmKyZugNFU0X40H/0GX6zpTEdyQJVgDLDmOlo7v
X5W3ioRfjQL28Rm+ISAoq/CTHuekggimhIsTz/TIJpYeS+657Bg+rPE2BE4S6Aag
IdyMOQcMVfjUxcit9UTssMueSsw7VcnLJSbi7hEcBAtIRkf6IA2gxZ/lvrx7VmD5
Odfg3mdUrnckVdv7Y6/7tWCMY+3sXuqLQwat3a83mP13jRjgbQw9QnVhib9yoHg3
+qJnDoH9BT3+PWig42u7893qaasCzqFiiyjlGnjg9YznrRRZvq0Szwqux/JgzWy4
ypmX5Xo4ueZGuLMpmb2Sb8RbE83q3u9nx15nTWFdC+IUxa12DnX1sid27YCDva4=
=oVW8
-----END PGP SIGNATURE-----


