[openstack-dev] State of the Gate - Dec 12

Joe Gordon joe.gordon0 at gmail.com
Thu Dec 12 18:54:33 UTC 2013


On Thu, Dec 12, 2013 at 7:19 PM, Matt Riedemann
<mriedem at linux.vnet.ibm.com> wrote:

>
>
> On 12/12/2013 7:20 AM, Sean Dague wrote:
>
>> Current Gate Length: 12hrs*, 41 deep
>>
>> (top of gate entered 12hrs ago)
>>
>> It's been an *exciting* week this week. For people not paying attention
>> we had 2 external events which made things terrible earlier in the week.
>>
>> ==========================
>> Event 1: sphinx 1.2 complete breakage - MOSTLY RESOLVED
>> ==========================
>>
>> It turns out sphinx 1.2 + distutils (which pbr magic calls through) means
>> total sadness. The fix for this was a requirements pin to sphinx < 1.2,
>> and until a project has taken that they will fail in the gate.
>>
>> It also turns out that tox installs pre-released software by default (a
>> terrible default behavior), so you also need a tox.ini change like this
>> - https://github.com/openstack/nova/blob/master/tox.ini#L9 otherwise
>> local users will install things like sphinx 1.2b3. They will also break
>> in other ways.
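
To illustrate why the pin alone is not enough: under the usual version
ordering, a beta such as 1.2b3 sorts before the final 1.2, so it still
satisfies "sphinx < 1.2" whenever pre-releases are allowed. The sketch below
is a simplified comparison, not pip's or tox's actual logic; the helper names
version_key, is_prerelease and satisfies_lt are hypothetical and only handle
simple versions like 1.1.3, 1.2b3 or 1.2.

```python
import re

# Simplified ordering of pre-release tags; final releases sort after all of them.
_PRE_ORDER = {"a": 0, "b": 1, "rc": 2}
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?$")


def version_key(version, width=4):
    """Return a sortable key for versions like '1.1.3', '1.2b3' or '1.2'."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)
    release = [int(part) for part in match.group(1).split(".")]
    release += [0] * (width - len(release))  # pad so that 1.2 equals 1.2.0
    if match.group(2):
        pre = (0, _PRE_ORDER[match.group(2)], int(match.group(3)))
    else:
        pre = (1, 0, 0)  # a final release sorts after its own pre-releases
    return (tuple(release), pre)


def is_prerelease(version):
    """True for alpha, beta and release-candidate versions."""
    return version_key(version)[1][0] == 0


def satisfies_lt(version, upper, allow_pre):
    """Hypothetical check for a '< upper' pin, optionally rejecting pre-releases."""
    if not allow_pre and is_prerelease(version):
        return False
    return version_key(version) < version_key(upper)
```

With pre-releases allowed, satisfies_lt("1.2b3", "1.2", allow_pre=True) is
true, which is how a beta can slip past the pin. With allow_pre=False the beta
is rejected while 1.1.3 is still accepted.
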
>>
>> Not all projects have merged this. If you are a project that hasn't,
>> please don't send any other jobs to the gate until you do. A lot of
>> delay was added to the gate yesterday by Glance patches being pushed to
>> the gate before their doc jobs were done.
>>
>> ==========================
>> Event 2: apt.puppetlabs.com outage - RESOLVED
>> ==========================
>>
>> We use that apt repository to set up the devstack nodes in nodepool with
>> puppet. We were triggering an issue with grenade where its apt-get
>> calls were failing, because it does apt-get update once to make sure
>> life is good. This only triggered in grenade (not other devstack runs)
>> because we do set -o errexit aggressively.
>>
>> A fix in grenade to ignore these errors was merged yesterday afternoon
>> (the purple line - http://status.openstack.org/elastic-recheck/ you can
>> see where it showed up).
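
The general idea of tolerating a failing best-effort step, while still failing
hard on everything else, can be sketched in Python. This is only an analogy to
the shell errexit behaviour described above, not the actual grenade change;
run_step and its best_effort flag are hypothetical names.

```python
import subprocess
import sys


def run_step(cmd, best_effort=False):
    """Run cmd; raise on failure unless the step is marked best-effort."""
    result = subprocess.run(cmd)
    if result.returncode != 0:
        if not best_effort:
            # Comparable to 'set -o errexit': any failure aborts the run.
            raise subprocess.CalledProcessError(result.returncode, cmd)
        # Comparable to ignoring the error: report it and carry on.
        print("warning: %r exited with %d, continuing"
              % (cmd, result.returncode), file=sys.stderr)
    return result.returncode
```

A step such as a repository refresh would be called with best_effort=True, so
that an outage of one external mirror is logged instead of ending the run.
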
>>
>> ==========================
>> Top Gate Bugs
>> ==========================
>>
>> We normally do this as a list, and you can see the whole list here -
>> http://status.openstack.org/elastic-recheck/ (now sorted by number of
>> FAILURES in the last 2 weeks)
>>
>> That being said, our biggest race bug is currently this one bug -
>> https://bugs.launchpad.net/tempest/+bug/1253896 - and if you want to
>> merge patches, fixing that one bug will be huge.
>>
>> Basically, you can't ssh into guests that get created. That's sort of a
>> fundamental property of a cloud. It shows up more frequently on neutron
>> jobs, possibly due to actually testing the metadata server path. There
>> have been many attempts at retry logic for this; we actually retry for
>> 196 seconds to get in and only fail once we can't get in, so waiting
>> isn't helping. It doesn't seem like the env is under that much load.
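
For reference, the retry pattern described here looks roughly like the sketch
below. It is not tempest's actual code; wait_until, its parameters and the
injectable clock are assumptions made for illustration. It shows why a longer
timeout does not help when the check never succeeds: the loop simply runs
until the deadline and then reports failure.

```python
import time


def wait_until(check, timeout, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call check() until it returns True or timeout seconds have elapsed.

    Returns a (succeeded, attempts) tuple.
    """
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        if check():
            return True, attempts
        if clock() >= deadline:
            # The condition never became true; more waiting would not help.
            return False, attempts
        sleep(interval)
```

Passing clock and sleep as arguments makes it possible to exercise the
196-second case in a test without actually waiting.
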
>>
>> Until we resolve this, life will not be good in landing patches.
>>
>>         -Sean
>>
>>
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
> There have been a few threads [1][2] on gate failures and the process
> around what happens when we go about identifying, tracking and fixing them.
>
> I couldn't find anything outside of the mailing list to keep a record of
> this so started a page here [3].
>

Thanks for writing all that down! However, we are trying to keep all the
elastic-recheck documentation in elastic-recheck itself, so a patch to the
elastic-recheck docs would be very welcome.

https://review.openstack.org/#/c/61300/1
https://review.openstack.org/#/c/61298/


The one big thing the wiki doesn't mention is one of the most important
parts: actually fixing the bugs listed at
http://status.openstack.org/elastic-recheck/.


>
> Feel free to contribute so we can point people to how they can easily help
> in working these faster.
>
> [1] http://lists.openstack.org/pipermail/openstack-dev/2013-November/020280.html
> [2] http://lists.openstack.org/pipermail/openstack-dev/2013-November/019931.html
> [3] https://wiki.openstack.org/wiki/ElasticRecheck
>
> --
>
> Thanks,
>
> Matt Riedemann
>
>
>