[openstack-dev] [tripleo] container jobs are unstable

Martin André m.andre at redhat.com
Thu Mar 23 15:24:27 UTC 2017


On Wed, Mar 22, 2017 at 2:20 PM, Dan Prince <dprince at redhat.com> wrote:
> On Wed, 2017-03-22 at 13:35 +0100, Flavio Percoco wrote:
>> On 22/03/17 13:32 +0100, Flavio Percoco wrote:
>> > On 21/03/17 23:15 -0400, Emilien Macchi wrote:
>> > > Hey,
>> > >
>> > > I've noticed that container jobs look pretty unstable lately; to me,
>> > > it sounds like a timeout:
>> > > http://logs.openstack.org/19/447319/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-containers-oooq-nv/bca496a/console.html#_2017-03-22_00_08_55_358973
>> >
>> > There are different hypotheses on what is going on here. Some patches
>> > have landed to improve the write performance on containers by using
>> > hostpath mounts, but we think the real slowness is coming from the
>> > image downloads.
>> >
>> > This said, this is still under investigation and the containers
>> > squad will
>> > report back as soon as there are new findings.
>>
>> Also, to be more precise, Martin André is looking into this. He also
>> fixed the
>> gate in the last 2 weeks.
>
> I spoke w/ Martin on IRC. He seems to think this is the cause of some
> of the failures:
>
> http://logs.openstack.org/32/446432/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-containers-oooq-nv/543bc80/logs/oooq/overcloud-controller-0/var/log/extra/docker/containers/heat_engine/log/heat/heat-engine.log.txt.gz#_2017-03-21_20_26_29_697
>
>
> Looks like Heat isn't able to create Nova instances in the overcloud
> due to "Host 'overcloud-novacompute-0' is not mapped to any cell". This
> means our cells initialization code for containers may not be quite
> right... or there is a race somewhere.

Here are some findings. I've looked at the timing measurements from CI
for https://review.openstack.org/#/c/448533/ which provided the most
recent results (all times in minutes):

* gate-tripleo-ci-centos-7-ovb-ha [1]
    undercloud install: 23
    overcloud deploy: 72
    total time: 125
* gate-tripleo-ci-centos-7-ovb-nonha [2]
    undercloud install: 25
    overcloud deploy: 48
    total time: 122
* gate-tripleo-ci-centos-7-ovb-updates [3]
    undercloud install: 24
    overcloud deploy: 57
    total time: 152
* gate-tripleo-ci-centos-7-ovb-containers-oooq-nv [4]
    undercloud install: 28
    overcloud deploy: 48
    total time: 165 (timeout)
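
As a rough cross-check of the numbers above, here is a minimal Python
sketch that computes how many minutes of each job fall outside the
undercloud install and overcloud deploy steps. The dictionary keys and
the remainder() helper are names made up for illustration; the values
are copied from the list above. Note that the containers total is the
point where the job timed out, so its real remainder would be larger.

```python
# Times in minutes, copied from the CI results listed above.
# Key names are illustrative, not taken from any CI tooling.
jobs = {
    "ovb-ha": {"undercloud": 23, "overcloud": 72, "total": 125},
    "ovb-nonha": {"undercloud": 25, "overcloud": 48, "total": 122},
    "ovb-updates": {"undercloud": 24, "overcloud": 57, "total": 152},
    # The total here is the timeout, not a completed run.
    "ovb-containers-oooq-nv": {"undercloud": 28, "overcloud": 48, "total": 165},
}


def remainder(job):
    """Minutes not spent in undercloud install or overcloud deploy."""
    return job["total"] - job["undercloud"] - job["overcloud"]


remainders = {name: remainder(job) for name, job in jobs.items()}
for name, minutes in remainders.items():
    print(f"{name}: {minutes} min outside install/deploy")
```

The containers job has the largest remainder of the four, which is
consistent with the slowdown sitting outside the two main steps.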

Looking at the undercloud and overcloud install times, the most
time-consuming tasks, the containers job isn't doing that badly compared
to other OVB jobs. But looking closer I could see that:
- the containers job pulls docker images from dockerhub, which takes
roughly 18 min.
- the overcloud validate task takes 10 min more than it should because
of the bug Dan mentioned (a fix is in the queue at
https://review.openstack.org/#/c/448575/)
- the postci step takes a long time with quickstart: 13 min (4 min of
which is spent on docker log collection alone), whereas it takes only
3 min when using tripleo.sh

Adding all these numbers, we're at about 40 min of additional time for
the oooq containers job, which is enough to cross the CI job time limit.
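
The arithmetic behind that estimate can be sketched in a few lines of
Python. The variable names are made up for illustration, and counting
postci as the difference between quickstart and tripleo.sh (rather than
the full 13 min) is an assumption about how the figures combine; either
way the result lands close to 40 min.

```python
# Extra minutes attributed to the containers job, from the list above.
image_pull = 18          # pulling docker images from dockerhub
validate_extra = 10      # overcloud validate slowdown caused by the bug
postci_quickstart = 13   # postci with quickstart
postci_tripleo_sh = 3    # postci with tripleo.sh, used as the baseline

# Assumption: only the postci time beyond the tripleo.sh baseline is extra.
overhead_vs_baseline = image_pull + validate_extra + (
    postci_quickstart - postci_tripleo_sh
)

# Alternative reading: count the whole quickstart postci step.
overhead_full_postci = image_pull + validate_extra + postci_quickstart

print(overhead_vs_baseline, overhead_full_postci)
```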

There is certainly a lot of room for optimization here and there, and
I'll explore how we can speed up the containers CI job over the next
few weeks.

Martin

[1] http://logs.openstack.org/33/448533/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/d2c1b16/
[2] http://logs.openstack.org/33/448533/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/d6df760/
[3] http://logs.openstack.org/33/448533/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-updates/3b1f795/
[4] http://logs.openstack.org/33/448533/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-containers-oooq-nv/b816f20/

> Dan
>
>>
>> Flavio
>>
>>
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>


