[openstack-dev] [Magnum] gate issues

Corey O'Brien coreypobrien at gmail.com
Fri Feb 12 13:01:06 UTC 2016


Hey all,

We've made some progress with the gates this past week. There are still
some issues, but I want to point out that I've also seen a lot of real
errors get a recheck comment recently. It slows the gate down and wastes
infra quota to recheck things that are going to fail again. Can I suggest
that we all get back in the habit of looking at failures and noting a
reason for each recheck? This will also help track which issues still
remain to be fixed in the gates.
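
For example, instead of a bare "recheck", a comment along these lines
(assuming the usual recheck trigger still matches when a reason follows on
the next line):

    recheck
    gate-functional-dsvm-magnum-api bay create timed out, looks like
    https://bugs.launchpad.net/magnum/+bug/1541105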

Thanks,
Corey

On Mon, Feb 8, 2016 at 12:10 PM Hongbin Lu <hongbin.lu at huawei.com> wrote:

> Hi Team,
>
>
>
> In order to resolve issue #3, it looks like we have to significantly
> reduce the memory consumption of the gate tests. Details can be found in
> this patch: https://review.openstack.org/#/c/276958/ . For the core team, a
> fast review and approval of that patch would be greatly appreciated, since
> it is hard to work with a gate that takes several hours to complete. Thanks.
>
>
>
> Best regards,
>
> Hongbin
>
>
>
> From: Corey O'Brien [mailto:coreypobrien at gmail.com]
> Sent: February-05-16 12:04 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: [openstack-dev] [Magnum] gate issues
>
>
>
> So as we're all aware, the gate is a mess right now. I wanted to sum up
> some of the issues so we can figure out solutions.
>
>
>
> 1. The functional-api job sometimes fails because bays time out while
> building after 1 hour. The logs look something like this:
>
> magnum.tests.functional.api.v1.test_bay.BayTest.test_create_list_and_delete_bays
> [3733.626171s] ... FAILED
>
> I can reproduce this hang on my devstack with etcdctl 2.0.10 as described
> in this bug (https://bugs.launchpad.net/magnum/+bug/1541105). However,
> either my fix to use 2.2.5 (https://review.openstack.org/#/c/275994/) is
> incomplete or there is another intermittent problem, because the hang
> happened again even with that fix: (
> http://logs.openstack.org/94/275994/1/check/gate-functional-dsvm-magnum-api/32aacb1/console.html
> )
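>
> As a quick sanity check, here is a minimal sketch (just an illustration,
> assuming etcd's default v2 client endpoint at 127.0.0.1:2379 is reachable
> on the bay master and Python 2 is available there) to confirm which etcd
> version the image actually shipped:
>
>     # Query etcd's /version endpoint on the default client port.
>     import urllib2
>
>     resp = urllib2.urlopen("http://127.0.0.1:2379/version", timeout=5)
>     # Print the raw payload; its format varies between etcd releases.
>     print(resp.read())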
>
>
>
> 2. The k8s job has some sort of intermittent hang as well, causing a
> symptom similar to the one seen with swarm.
> https://bugs.launchpad.net/magnum/+bug/1541964
>
>
>
> 3. When the functional-api job runs, it frequently destroys the VM,
> causing the Jenkins slave agent to die. Example:
> http://logs.openstack.org/03/275003/6/check/gate-functional-dsvm-magnum-api/a9a0eb9/console.html
>
> When this happens, zuul re-queues a new build from the start on a new VM.
> This can happen many times in a row before the job completes.
>
> I chatted with openstack-infra about this, and after taking a look at one
> of the VMs, it looks like memory over-consumption leading to thrashing was
> a possible culprit. The sshd daemon was also dead, but the console showed
> things like "INFO: task kswapd0:77 blocked for more than 120 seconds". A
> cursory glance at some of the jobs seems to indicate that this doesn't
> happen on RAX VMs, which have swap devices, unlike the OVH VMs.
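>
> A minimal sketch (just an illustration, not part of the gate scripts) to
> confirm from inside a test VM whether a swap device is configured, by
> reading /proc/meminfo:
>
>     # Parse /proc/meminfo into a dict of field name -> value string.
>     with open("/proc/meminfo") as f:
>         meminfo = dict(line.split(":", 1) for line in f if ":" in line)
>
>     # SwapTotal of 0 kB would mean no swap device, as suspected on OVH.
>     print("MemTotal:  " + meminfo["MemTotal"].strip())
>     print("SwapTotal: " + meminfo["SwapTotal"].strip())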
>
>
>
> 4. In general, even when things work, the gate is really slow. The
> sequential master-then-node build process, in combination with underpowered
> VMs, makes bay builds take 25-30 minutes when they do succeed. Since we're
> already close to tipping over a VM, we run the functional tests with
> concurrency=1, so 2 bay builds consume almost the entire allotted devstack
> testing time (generally about 75 minutes of actual test time, it seems).
>
>
>
> Corey