Hi,

Yes, the default value for rpc_state_report_workers is 1. There are some basic hints for these values in the config reference:

*api_workers*: https://docs.openstack.org/neutron/latest/configuration/neutron.html#DEFAULT... : "Number of separate API worker processes for service. If not specified, the *default is equal to the number of CPUs available for best performance, capped by potential RAM usage.*"

*rpc_workers*: https://docs.openstack.org/neutron/latest/configuration/neutron.html#DEFAULT... : "Number of RPC worker processes for service. If not specified, the *default is equal to half the number of API workers.*"

(For example, on a controller with 8 CPUs that means 8 API workers and 4 RPC workers by default.)

I think this is a useful place to start playing with these settings and find which combination is best for your environment.

Regards
Lajos

Md. Hejbul Tawhid MUNNA <munnaeebd@gmail.com> wrote (on Wed, Jan 26, 2022, 6:56):
Hello Arnaud,
Thank you so much for your valuable feedback.
As per our neutron.conf file we have #rpc_workers = 1 and #rpc_state_report_workers = 1. Since these lines are commented out, are these the default values that neutron uses?
What is the procedure to increase the value? Do we remove the #, increase the worker count from 1 to 2, and restart the neutron-server service? (A sketch of such a change follows below.)
Regards, Munna
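For illustration, the change being asked about would look something like the following sketch (assumptions: a systemd-based deployment, and a service named neutron-server; the exact service name and restart command vary by distribution):

/etc/neutron/neutron.conf ([DEFAULT] section):

rpc_workers = 2
rpc_state_report_workers = 2

Then restart the Neutron API service on each controller node, e.g.:

systemctl restart neutron-server

The commented lines such as #rpc_workers = 1 only document the shipped defaults; uncommenting a line and setting a new value is what actually changes the behavior.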
On Tue, Jan 25, 2022 at 6:26 PM Arnaud <arnaud.morin@gmail.com> wrote:
Hi,
I am not a DB expert, but OpenStack tends to open a LOT of connections to the DB. So one of the things to monitor is the number of connections you have/allow on the DB side (an example check is sketched below).
Also, raising the number of RPC (and state report) workers will solve your issue. The right number is not easy to calculate and depends on each deployment.
A good approach is a try/improve loop: change the values, observe, and adjust.
Cheers, Arnaud.
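For example, assuming a MariaDB/MySQL backend and a database user named "neutron" (both typical, but deployment specific), the used and allowed connection counts can be checked with something like:

mysql -e "SHOW STATUS LIKE 'Threads_connected';"
mysql -e "SHOW VARIABLES LIKE 'max_connections';"
mysql -e "SHOW PROCESSLIST;" | grep -c neutron

If Threads_connected is close to max_connections, the server side is the bottleneck; if not, the QueuePool limit inside neutron-server is being hit first.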
On 25 January 2022 09:52:26 GMT+01:00, "Md. Hejbul Tawhid MUNNA" <munnaeebd@gmail.com> wrote:
Hello Arnaud,
Thank you for your valuable reply.
We did not modify the default RPC worker configuration.
/etc/neutron/neutron.conf
# Number of separate API worker processes for service. If not specified,
# the default is equal to the number of CPUs available for best
# performance. (integer value)
#api_workers = <None>

# Number of RPC worker processes for service. (integer value)
#rpc_workers = 1

# Number of RPC worker processes dedicated to state reports queue.
# (integer value)
#rpc_state_report_workers = 1
How do we check the load on the database? (One possible check is sketched after this message.) RAM/CPU/disk-I/O utilization is low on the database server.
Please guide us further.
Regards, Munna
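One way to look at this, assuming a MariaDB/MySQL backend: since the error above is about the connection pool rather than server resources, counting connections per database user can be more telling than CPU/RAM, e.g.:

mysql -e "SELECT user, COUNT(*) AS conns FROM information_schema.processlist GROUP BY user ORDER BY conns DESC;"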
On Mon, Jan 24, 2022 at 6:29 PM Arnaud <arnaud.morin@gmail.com> wrote:
Hi,
I would also consider checking the number of RPC workers you have in neutron.conf; it may be better to increase those before the connection pool params (both sets of options are sketched below).
Also, check your database: is it under load? Updating the agent state should not take long.
Cheers, Arnaud
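For reference, the two sets of options live in different sections of neutron.conf. A sketch, with illustrative worker values and the default pool values (not recommendations):

[DEFAULT]
# worker counts; per the advice above, tune these first
rpc_workers = 2
rpc_state_report_workers = 2

[database]
# oslo.db connection pool parameters; touch these only if still needed
max_pool_size = 5
max_overflow = 50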
On 24 January 2022 10:42:00 GMT+01:00, "Md. Hejbul Tawhid MUNNA" <munnaeebd@gmail.com> wrote:
Hi,
Currently we have 500+ VMs running, and 383 networks in total, including HA networks.
Can you advise on appropriate values, and is there any chance of service impact?
Should we change the configuration in neutron.conf on the controller node?
Regards, Munna
On Mon, Jan 24, 2022 at 2:47 PM Slawek Kaplonski <skaplons@redhat.com> wrote:
Hi,
On Monday, 24 January 2022 09:09:10 CET Md. Hejbul Tawhid MUNNA wrote:
> Hi,
>
> Suddenly we observed a few VMs down. Then we found that some agents are
> going down (XXX); agents are going UP and down randomly. Please check
> the attachment.
>
> /////////////////////////////////////////////////////////////////////////////////////////////////
> /sqlalchemy/pool.py", line 788, in _checkout\n    fairy = _ConnectionRecord.checkout(pool)\n', u'  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 532, in checkout\n    rec = pool._do_get()\n', u'  File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 1186, in _do_get\n    (self.size(), self.overflow(), self._timeout), code="3o7r")\n', u'TimeoutError: QueuePool limit of size 5 overflow 50 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/3o7r)\n'].
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent Traceback (most recent call last):
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 837, in _report_state
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     True)
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/rpc.py", line 97, in report_state
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     return method(context, 'report_state', **kwargs)
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 179, in call
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     retry=self.retry)
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 133, in _send
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     retry=retry)
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 645, in send
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     call_monitor_timeout, retry=retry)
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 636, in _send
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent     raise result
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent RemoteError: Remote error: TimeoutError QueuePool limit of size 5 overflow 50 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/3o7r)
> 2022-01-24 01:05:39.592 302841 ERROR neutron.agent.l3.agent [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming\n    res = self.dispatcher.dispatch(message)\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch\n    return self._do_dispatch(endpoint, method, ctxt, args)\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch\n
>
> /////////////////////////////////////////////////////////////////////////////////
>
> Is there anything related to the following default configuration?
>
> /etc/neutron/neutron.conf
> #max_pool_size = 5
> #max_overflow = 50
Yes. You probably have a busy environment, and you need to increase those values to allow more connections from the neutron server to the database.
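For example, a sketch of such an increase (the numbers are illustrative, not a recommendation):

[database]
max_pool_size = 20
max_overflow = 100

Keep in mind that each neutron-server worker process keeps its own pool, so the total number of database connections can grow to roughly workers x (max_pool_size + max_overflow); the database's max_connections has to accommodate that.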
> regards,
> Munna
--
Slawek Kaplonski
Principal Software Engineer
Red Hat