Neutron issue || Remote error: TimeoutError QueuePool limit of size 5 overflow 50 reached

Lajos Katona katonalala at gmail.com
Thu Jan 27 08:42:25 UTC 2022


Hi,
Yes, the default value for rpc_state_report_workers is 1.
There are some basic hints for these values in the config:

*api_workers*:
https://docs.openstack.org/neutron/latest/configuration/neutron.html#DEFAULT.api_workers
: "Number of separate API worker processes for service. If not specified,
the *default is equal to the number of CPUs available for best performance,
capped by potential RAM usage.*"

*rpc_workers*:
https://docs.openstack.org/neutron/latest/configuration/neutron.html#DEFAULT.rpc_workers
: "Number of RPC worker processes for service. If not specified, the *default
is equal to half the number of API workers.*"

I think it is useful to start playing with these settings and to find the
combination that works best for your environment.
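
For example, as a purely illustrative starting point in neutron.conf (the
numbers below are just placeholders to tune for your own deployment, not
recommendations):

[DEFAULT]
# Example values only - adjust to your CPU count, RAM and observed load
api_workers = 4
rpc_workers = 2
rpc_state_report_workers = 2

After changing them, restart neutron-server so the new worker counts take
effect.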

Regards
Lajos

Md. Hejbul Tawhid MUNNA <munnaeebd at gmail.com> wrote (on Wed, Jan 26, 2022,
6:56):

> Hello  Arnaud,
>
> Thank you so much for your valuable feedback.
>
> As per our neutron.conf file, #rpc_workers = 1 and #rpc_state_report_workers
> = 1. Are these the default values Neutron uses?
>
> What is the procedure to increase the values? Remove the #, increase the
> worker count from 1 to 2, and restart the neutron-server service?
>
> Regards,
> Munna
>
>
>
> On Tue, Jan 25, 2022 at 6:26 PM Arnaud <arnaud.morin at gmail.com> wrote:
>
>> Hi,
>>
>> I am not a DB expert, but OpenStack tends to open a LOT of connections to
>> the DB. So one of the things to monitor is the number of connections you
>> have/allow on the DB side.
>>
>> Also, raising the number of RPC (and state report) workers will solve
>> your issue.
>> The right number is not easy to calculate and depends on each deployment.
>>
>> A good approach is a try/improve loop.
>>
>> Cheers,
>> Arnaud.
>>
>>
>> On 25 January 2022 09:52:26 GMT+01:00, "Md. Hejbul Tawhid MUNNA" <
>> munnaeebd at gmail.com> wrote:
>>>
>>> Hello  Arnaud,
>>>
>>> Thank you for your valuable reply.
>>>
>>> We did not modify the default RPC worker configuration.
>>>
>>> /etc/neutron/neutron.conf
>>>
>>> # Number of separate API worker processes for service. If not specified, the
>>> # default is equal to the number of CPUs available for best performance.
>>> # (integer value)
>>> #api_workers = <None>
>>>
>>> # Number of RPC worker processes for service. (integer value)
>>> #rpc_workers = 1
>>>
>>> # Number of RPC worker processes dedicated to state reports queue. (integer
>>> # value)
>>> #rpc_state_report_workers = 1
>>>
>>> How do we check the load on the database? RAM/CPU/disk I/O utilization is
>>> low on the database server.
>>>
>>> Please guide us further.
>>>
>>> Regards,
>>> Munna
>>>
>>> On Mon, Jan 24, 2022 at 6:29 PM Arnaud <arnaud.morin at gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I would also consider checking the number of RPC workers you have in
>>>> neutron.conf; increasing this may be a better option than tuning the
>>>> connection pool params.
>>>>
>>>> Also, check your database: is it under load?
>>>> Updating the agent state should not take long.
>>>>
>>>> Cheers,
>>>> Arnaud
>>>>
>>>>
>>>>
>>>> On 24 January 2022 10:42:00 GMT+01:00, "Md. Hejbul Tawhid MUNNA" <
>>>> munnaeebd at gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Currently we have 500+ VMs running, and the total number of networks is
>>>>> 383, including HA networks.
>>>>>
>>>>> Can you advise on appropriate values, and is there any chance of
>>>>> service impact?
>>>>>
>>>>> Should we change the configuration in neutron.conf on the controller
>>>>> node?
>>>>>
>>>>> Regards,
>>>>> Munna
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jan 24, 2022 at 2:47 PM Slawek Kaplonski <skaplons at redhat.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On Monday, 24 January 2022 at 09:09:10 CET, Md. Hejbul Tawhid MUNNA
>>>>>> wrote:
>>>>>> > Hi,
>>>>>> >
>>>>>> > Suddenly we observed a few VMs go down. Then we found that some agents
>>>>>> > are down (XXX); agents keep going up and down randomly. Please check
>>>>>> > the attachment.
>>>>>> >
>>>>>> >
>>>>>> > /////////////////////////////////////////////////////////////////////////////
>>>>>> > /sqlalchemy/pool.py", line 788, in _checkout\n    fairy = _ConnectionRecord.checkout(pool)\n', u'  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 532, in checkout\n    rec = pool._do_get()\n', u'  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 1186, in _do_get\n    (self.size(), self.overflow(), self._timeout), code="3o7r")\n', u'TimeoutError: QueuePool limit of size 5 overflow 50 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/3o7r)\n'].
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent Traceback (most recent call last):
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 837, in _report_state
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     True)
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/rpc.py", line 97, in report_state
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     return method(context, 'report_state', **kwargs)
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 179, in call
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     retry=self.retry)
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 133, in _send
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     retry=retry)
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 645, in send
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     call_monitor_timeout, retry=retry)
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 636, in _send
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     raise result
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent RemoteError: Remote error: TimeoutError QueuePool limit of size 5 overflow 50 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/3o7r)
>>>>>> > 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming\n    res = self.dispatcher.dispatch(message)\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch\n    return self._do_dispatch(endpoint, method, ctxt, args)\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch\n
>>>>>> >
>>>>>> > /////////////////////////////////////////////////////////////////////////////
>>>>>> >
>>>>>> > Is this related to the following default configuration?
>>>>>> >
>>>>>> > /etc/neutron/neutron.conf
>>>>>> > #max_pool_size = 5
>>>>>> > #max_overflow = 50
>>>>>>
>>>>>> Yes. You probably have a busy environment and need to increase those
>>>>>> values to allow more connections from the neutron server to the database.
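>>>>>>
>>>>>> For example (the values below are only an illustration, the right numbers
>>>>>> depend on your load), in the [database] section of neutron.conf:
>>>>>>
>>>>>> [database]
>>>>>> # Example values only - size these to what your database can accept
>>>>>> max_pool_size = 20
>>>>>> max_overflow = 100
>>>>>>
>>>>>> Keep in mind that the database itself must also allow enough connections
>>>>>> (e.g. max_connections in MySQL/MariaDB), and that neutron-server needs a
>>>>>> restart afterwards.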
>>>>>>
>>>>>> >
>>>>>> > regards,
>>>>>> > Munna
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Slawek Kaplonski
>>>>>> Principal Software Engineer
>>>>>> Red Hat
>>>>>
>>>>>