[openstack-dev] [gate][neutron][infra] tempest jobs timing out due to general sluggishness of the node?

Attila Fazekas afazekas at redhat.com
Fri Feb 10 08:24:10 UTC 2017


I wonder, can we switch to CINDER_ISCSI_HELPER="lioadm"?
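
For anyone who wants to try that locally first, it is a one-line devstack
setting; a minimal local.conf sketch (the gate job itself would need the
equivalent change in its devstack-gate configuration):

    [[local|localrc]]
    # use the LIO target tooling instead of tgtd for cinder iSCSI exports
    CINDER_ISCSI_HELPER=lioadm

With lioadm there is no tgtd daemon in the picture at all, so the verbose
iSCSI tracing discussed below would disappear from syslog.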

On Fri, Feb 10, 2017 at 9:17 AM, Miguel Angel Ajo Pelayo <
majopela at redhat.com> wrote:

> I believe those are traces left by the reference implementation of cinder
> setting a very high debug level on tgtd. I'm not sure if that's related or
> the culprit at all (probably the culprit is a mix of things).
>
> I wonder if we could disable such verbosity on tgtd; that amount of logging
> is certainly going to slow things down.
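>
> Something along these lines on a held node would confirm whether debug
> logging is actually on (untested, just a sketch):
>
>     # show how tgtd was started; a -d/--debug flag means verbose iSCSI tracing
>     ps -o args= -C tgtd
>     # how much of syslog is tgtd chatter
>     sudo grep -c 'tgtd\[' /var/log/syslog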
>
> On Fri, Feb 10, 2017 at 9:07 AM, Antonio Ojea <aojea at midokura.com> wrote:
>
>> I guess it's an infra issue, specifically related to the storage, or the
>> network that provides the storage.
>>
>> If you look at the syslog file [1], there are a lot of these entries:
>>
>> Feb 09 04:20:42 ubuntu-xenial-rax-ord-7193667 tgtd[8542]: tgtd: iscsi_task_tx_start(2024) no more data
>> Feb 09 04:20:42 ubuntu-xenial-rax-ord-7193667 tgtd[8542]: tgtd: iscsi_task_tx_start(1996) found a task 71 131072 0 0
>> Feb 09 04:20:42 ubuntu-xenial-rax-ord-7193667 tgtd[8542]: tgtd: iscsi_data_rsp_build(1136) 131072 131072 0 26214471
>> Feb 09 04:20:42 ubuntu-xenial-rax-ord-7193667 tgtd[8542]: tgtd: __cmd_done(1281) (nil) 0x2563000 0 131072
>>
>> grep tgtd syslog.txt.gz| wc
>>   139602 1710808 15699432
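>>
>> (zgrep can do the line count directly against the gzipped copy downloaded
>> from the log server, without decompressing it first, e.g.:)
>>
>>     zgrep -c tgtd syslog.txt.gz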
>>
>> [1] http://logs.openstack.org/95/429095/2/check/gate-tempest-dsvm-neutron-dvr-ubuntu-xenial/35aa22f/logs/syslog.txt.gz
>>
>>
>>
>> On Fri, Feb 10, 2017 at 5:59 AM, Ihar Hrachyshka <ihrachys at redhat.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> Lately I've noticed a number of job failures in the neutron gate that
>>> all result in job timeouts. I describe the
>>> gate-tempest-dsvm-neutron-dvr-ubuntu-xenial job below, though I see
>>> timeouts happening in other jobs too.
>>>
>>> The failure mode is that all operations, ./stack.sh and each tempest
>>> test, take significantly more time (roughly 50% to 150% more), which
>>> triggers the job timeout. An example of what I mean can be found in [1].
>>>
>>> A good run usually takes ~20 minutes to stack up devstack and then ~40
>>> minutes to pass the full suite; a bad run usually takes ~30 minutes for
>>> ./stack.sh, and then 1:20h+ until it is killed due to the timeout.
>>>
>>> It affects different clouds (we see rax, internap, infracloud-vanilla
>>> and ovh jobs affected; we haven't seen osic though). It can't be e.g.
>>> slow pypi or apt mirrors, because then we would see the slowdown in the
>>> ./stack.sh phase only.
>>>
>>> We can't be sure that the CPUs are the same, and devstack does not seem
>>> to dump /proc/cpuinfo anywhere (in the end, it's all virtual, so not sure
>>> if it would help anyway). Nor do we have a way to learn whether the
>>> slowness could be a result of adherence to RFC1149. ;)
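>>>
>>> A small diagnostic at the end of the job would at least let us compare
>>> hosts between good and bad runs; a rough sketch (not something the job
>>> collects today, as far as I know):
>>>
>>>     # dump CPU model, clock and core count so runs can be compared later
>>>     grep -E 'model name|cpu MHz' /proc/cpuinfo | sort | uniq -c
>>>     nproc
>>>     uptime    # load average, in case the host itself is oversubscribed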
>>>
>>> We discussed the matter in the neutron channel [2], though we couldn't
>>> figure out the culprit or where to go next. At this point we assume it's
>>> not neutron's fault, and we hope others (infra?) may have suggestions on
>>> where to look.
>>>
>>> [1] http://logs.openstack.org/95/429095/2/check/gate-tempest-dsvm-neutron-dvr-ubuntu-xenial/35aa22f/console.html#_2017-02-09_04_47_12_874550
>>> [2] http://eavesdrop.openstack.org/irclogs/%23openstack-neutron/%23openstack-neutron.2017-02-10.log.html#t2017-02-10T04:06:01
>>>
>>> Thanks,
>>> Ihar
>>>
>>>
>>
>>
>>
>>
>
>
>

