[openstack-dev] [gate][neutron][infra] tempest jobs timing out due to general sluggishness of the node?

Miguel Angel Ajo Pelayo majopela at redhat.com
Fri Feb 10 08:17:27 UTC 2017


I believe those are traces left by the reference implementation of cinder
setting a very high debug level on tgtd. I'm not sure whether that's related
or the culprit at all (probably the culprit is a mix of things).

I wonder if we could disable such verbosity on tgtd; it is certainly going
to slow things down.
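
For example, something like this might do it (a rough sketch; I'm assuming
the tgt build on these nodes accepts the sys-mode "debug" parameter through
tgtadm, which I haven't verified):

  # attempt to switch off tgtd debug logging on a running daemon
  sudo tgtadm --mode sys --op update --name debug --value off

Otherwise, restarting tgtd without a -d/--debug flag should have the same
effect; either way we'd first need to find where the debug level gets set
during the cinder/devstack setup.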

On Fri, Feb 10, 2017 at 9:07 AM, Antonio Ojea <aojea at midokura.com> wrote:

> I guess it's an infra issue, specifically related to the storage, or to the
> network that provides the storage.
>
> If you look at the syslog file [1], there are a lot of entries like these:
>
> Feb 09 04:20:42 ubuntu-xenial-rax-ord-7193667 tgtd[8542]: tgtd: iscsi_task_tx_start(2024) no more data
> Feb 09 04:20:42 ubuntu-xenial-rax-ord-7193667 tgtd[8542]: tgtd: iscsi_task_tx_start(1996) found a task 71 131072 0 0
> Feb 09 04:20:42 ubuntu-xenial-rax-ord-7193667 tgtd[8542]: tgtd: iscsi_data_rsp_build(1136) 131072 131072 0 26214471
> Feb 09 04:20:42 ubuntu-xenial-rax-ord-7193667 tgtd[8542]: tgtd: __cmd_done(1281) (nil) 0x2563000 0 131072
>
> grep tgtd syslog.txt.gz | wc
>   139602 1710808 15699432
>
> [1] http://logs.openstack.org/95/429095/2/check/gate-tempest-dsvm-neutron-dvr-ubuntu-xenial/35aa22f/logs/syslog.txt.gz
>
>
>
> On Fri, Feb 10, 2017 at 5:59 AM, Ihar Hrachyshka <ihrachys at redhat.com>
> wrote:
>
>> Hi all,
>>
>> Lately I've noticed a number of job failures in the neutron gate that all
>> result in job timeouts. I describe the
>> gate-tempest-dsvm-neutron-dvr-ubuntu-xenial job below, though I see
>> timeouts happening in other jobs too.
>>
>> The failure mode is that all operations (./stack.sh and each tempest test)
>> take significantly more time, roughly 50% to 150% more, which triggers the
>> job timeout. An example of what I mean can be found in [1].
>>
>> A good run usually takes ~20 minutes to stack up devstack, then ~40
>> minutes to pass the full suite; a bad run usually takes ~30 minutes for
>> ./stack.sh, and then 1h20m+ until it is killed due to the timeout.
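>>
>> (For comparing runs, the stacking time can be pulled straight from the
>> console log; devstack prints a summary line at the end of stacking, so
>> something like this should work, assuming the wording of that line hasn't
>> changed:
>>
>>   curl -s <console log url> | grep -o 'stack.sh completed in [0-9]* seconds'
>> )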
>>
>> It affects different clouds (we see rax, internap, infracloud-vanilla,
>> and ovh jobs affected; we haven't seen osic affected, though). It can't
>> be, e.g., slow pypi or apt mirrors, because then we would see the
>> slowdown in the ./stack.sh phase only.
>>
>> We can't be sure that the CPUs are the same, and devstack does not seem
>> to dump /proc/cpuinfo anywhere (in the end, it's all virtual, so I'm not
>> sure it would help anyway). Nor do we have a way to learn whether the
>> slowness could be a result of adherence to RFC1149. ;)
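>>
>> If it would help, we could make the job dump something like this on each
>> node (a sketch; the "st" column of vmstat shows CPU time stolen by the
>> hypervisor, which would at least tell us whether the node is starved):
>>
>>   grep 'model name' /proc/cpuinfo | sort | uniq -c
>>   vmstat 1 5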
>>
>> We discussed the matter in the neutron channel [2], though we couldn't
>> figure out the culprit or where to go next. At this point we assume it's
>> not neutron's fault, and we hope others (infra?) may have suggestions on
>> where to look.
>>
>> [1] http://logs.openstack.org/95/429095/2/check/gate-tempest-dsvm-neutron-dvr-ubuntu-xenial/35aa22f/console.html#_2017-02-09_04_47_12_874550
>> [2] http://eavesdrop.openstack.org/irclogs/%23openstack-neutron/%23openstack-neutron.2017-02-10.log.html#t2017-02-10T04:06:01
>>
>> Thanks,
>> Ihar
>>

