RabbitMQ annoying disconnections

Adam Tomas bkslash at poczta.onet.pl
Fri Sep 17 12:00:58 UTC 2021


Hi,
after some struggling I have almost „clear” logs (clear=error free :) ) Almost…. RabbitMQ keeps disconnecting sessions and there is a huge amount of disconnect errors in all logs (see below). I found this bug description:

https://bugzilla.redhat.com/show_bug.cgi?id=1711794 <https://bugzilla.redhat.com/show_bug.cgi?id=1711794>

in which we can read as follows: "this is a recoverable issue that is already handled by how oslo.messaging is designed. disconnection is not an error and should not be reported as such in the logs.”

but… It is reported :( And produces tons of logs.

I tried to modify heartbeat values - helped a little bit, but I had to increase [database] max_pool_size = 5 and that of course multiplied number of disconnection errors by 5 :( 

[oslo_messaging_rabbit]
heartbeat_timeout_threshold = 720
heartbeat_interval = 360
heartbeat_rate = 4

How to avoid this disconnections?? Or what to do to avoid writing this „errors”to logs?

Best regards
Adam Tomas

/var/log/kolla/rabbitmq/rabbitXXX.log

2021-09-17 06:43:08.771 [error] <0.16056.678> closing AMQP connection <0.16056.678> (x.x.x.x00:44882 -> x.x.x.x00:5672 - mod_wsgi:959:aaa12af5-bb0b-456e-a129-a466eccc24f0):
missed heartbeats from client, timeout: 60s
2021-09-17 06:43:08.773 [info] <0.7412.679> Closing all channels from connection 'x.x.x.x00:44882 -> x.x.x.x00:5672' because it has been closed
2021-09-17 06:43:09.282 [info] <0.4574.679> accepting AMQP connection <0.4574.679> (x.x.x.x11:58170 -> x.x.x.x00:5672)
2021-09-17 06:43:09.283 [info] <0.28410.678> accepting AMQP connection <0.28410.678> (x.x.x.x11:58172 -> x.x.x.x00:5672)
2021-09-17 06:43:09.285 [info] <0.28410.678> Connection <0.28410.678> (x.x.x.x11:58172 -> x.x.x.x00:5672) has a client-provided name: magnum-conductor:954:c05a1221-6915-4414-ba93-7db5450ff394
2021-09-17 06:43:09.285 [info] <0.4574.679> Connection <0.4574.679> (x.x.x.x11:58170 -> x.x.x.x00:5672) has a client-provided name: magnum-conductor:967:08f29ce5-d6ee-43ae-87a9-e8c8856f8c93
2021-09-17 06:43:09.286 [info] <0.4574.679> connection <0.4574.679> (x.x.x.x11:58170 -> x.x.x.x00:5672 - magnum-conductor:967:08f29ce5-d6ee-43ae-87a9-e8c8856f8c93): user 'openstack' authenticated and granted access to vhost '/'
2021-09-17 06:43:09.286 [info] <0.28410.678> connection <0.28410.678> (x.x.x.x11:58172 -> x.x.x.x00:5672 - magnum-conductor:954:c05a1221-6915-4414-ba93-7db5450ff394): user 'openstack' authenticated and granted access to vhost '/'
2021-09-17 06:43:09.504 [error] <0.14465.678> closing AMQP connection <0.14465.678> (x.x.x.x00:44914 -> x.x.x.x00:5672 - mod_wsgi:964:ced74779-2794-40f6-a01e-276bceb5d7b3):
missed heartbeats from client, timeout: 60s
2021-09-17 06:43:09.507 [info] <0.14985.679> Closing all channels from connection 'x.x.x.x00:44914 -> x.x.x.x00:5672' because it has been closed
2021-09-17 06:43:09.521 [info] <0.11574.678> accepting AMQP connection <0.11574.678> (x.x.x.x00:46834 -> x.x.x.x00:5672)
2021-09-17 06:43:09.523 [info] <0.11574.678> Connection <0.11574.678> (x.x.x.x00:46834 -> x.x.x.x00:5672) has a client-provided name: mod_wsgi:964:ced74779-2794-40f6-a01e-276bceb5d7b3
2021-09-17 06:43:09.525 [info] <0.11574.678> connection <0.11574.678> (x.x.x.x00:46834 -> x.x.x.x00:5672 - mod_wsgi:964:ced74779-2794-40f6-a01e-276bceb5d7b3): user 'openstack' authenticated and granted access to vhost '/'
2021-09-17 06:43:09.654 [info] <0.31575.679> accepting AMQP connection <0.31575.679> (x.x.x.x01:55304 -> x.x.x.x00:5672)
2021-09-17 06:43:09.656 [info] <0.31575.679> Connection <0.31575.679> (x.x.x.x01:55304 -> x.x.x.x00:5672) has a client-provided name: magnum-conductor:7:5a1cc3cc-150d-417a-9652-1279597c3db7
2021-09-17 06:43:09.656 [info] <0.31180.678> accepting AMQP connection <0.31180.678> (x.x.x.x01:55306 -> x.x.x.x00:5672)
2021-09-17 06:43:09.658 [info] <0.31575.679> connection <0.31575.679> (x.x.x.x01:55304 -> x.x.x.x00:5672 - magnum-conductor:7:5a1cc3cc-150d-417a-9652-1279597c3db7): user 'openstack' authenticated and granted access to vhost '/'
2021-09-17 06:43:09.658 [info] <0.31180.678> Connection <0.31180.678> (x.x.x.x01:55306 -> x.x.x.x00:5672) has a client-provided name: magnum-conductor:954:c8deffa4-7d67-46be-9b9d-09fa20544a2b
2021-09-17 06:43:09.660 [info] <0.2757.679> accepting AMQP connection <0.2757.679> (x.x.x.x11:58174 -> x.x.x.x00:5672)
2021-09-17 06:43:09.660 [info] <0.31180.678> connection <0.31180.678> (x.x.x.x01:55306 -> x.x.x.x00:5672 - magnum-conductor:954:c8deffa4-7d67-46be-9b9d-09fa20544a2b): user 'openstack' authenticated and granted access to vhost '/'
2021-09-17 06:43:09.662 [info] <0.2757.679> Connection <0.2757.679> (x.x.x.x11:58174 -> x.x.x.x00:5672) has a client-provided name: magnum-conductor:7:cebf81c9-95f6-48fa-8bde-30c01243d315
2021-09-17 06:43:09.663 [info] <0.2757.679> connection <0.2757.679> (x.x.x.x11:58174 -> x.x.x.x00:5672 - magnum-conductor:7:cebf81c9-95f6-48fa-8bde-30c01243d315): user 'openstack' authenticated and granted access to vhost '/'
2021-09-17 06:43:09.814 [error] <0.22797.678> closing AMQP connection <0.22797.678> (x.x.x.x05:57676 -> x.x.x.x00:5672 - neutron-openvswitch-agent:7:78d30277-173a-47d4-b8a8-3edde75f115e):
missed heartbeats from client, timeout: 60s
2021-09-17 06:43:09.817 [info] <0.7903.679> Closing all channels from connection 'x.x.x.x05:57676 -> x.x.x.x00:5672' because it has been closed
2021-09-17 06:43:10.319 [error] <0.18268.678> closing AMQP connection <0.18268.678> (x.x.x.x00:44924 -> x.x.x.x00:5672 - nova-conductor:957:e3951cbd-37ca-4d9f-8fac-394c83b36f09):
missed heartbeats from client, timeout: 60s
2021-09-17 06:43:10.322 [info] <0.28032.678> Closing all channels from connection 'x.x.x.x00:44924 -> x.x.x.x00:5672' because it has been closed
2021-09-17 06:43:10.330 [info] <0.19596.678> accepting AMQP connection <0.19596.678> (x.x.x.x00:46850 -> x.x.x.x00:5672)
2021-09-17 06:43:10.333 [info] <0.19596.678> Connection <0.19596.678> (x.x.x.x00:46850 -> x.x.x.x00:5672) has a client-provided name: nova-conductor:957:e3951cbd-37ca-4d9f-8fac-394c83b36f09
2021-09-17 06:43:10.334 [info] <0.19596.678> connection <0.19596.678> (x.x.x.x00:46850 -> x.x.x.x00:5672 - nova-conductor:957:e3951cbd-37ca-4d9f-8fac-394c83b36f09): user 'openstack' authenticated and granted access to vhost '/'
2021-09-17 06:43:10.831 [info] <0.9838.678> accepting AMQP connection <0.9838.678> (x.x.x.x05:57878 -> x.x.x.x00:5672)
2021-09-17 06:43:10.834 [info] <0.9838.678> Connection <0.9838.678> (x.x.x.x05:57878 -> x.x.x.x00:5672) has a client-provided name: neutron-openvswitch-agent:7:78d30277-173a-47d4-b8a8-3edde75f115e
2021-09-17 06:43:10.835 [info] <0.9838.678> connection <0.9838.678> (x.x.x.x05:57878 -> x.x.x.x00:5672 - neutron-openvswitch-agent:7:78d30277-173a-47d4-b8a8-3edde75f115e): user 'openstack' authenticated and granted access to vhost '/'
2021-09-17 06:43:11.738 [error] <0.15275.678> closing AMQP connection <0.15275.678> (x.x.x.x01:53414 -> x.x.x.x00:5672 - mod_wsgi:968:b60f723c-882e-4780-a162-55c03ebe6179):
missed heartbeats from client, timeout: 60s


/var/log/kolla/glance/glance-api.log
2021-09-17 13:17:23.212 958 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:17:29.208 955 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:17:42.190 957 INFO eventlet.wsgi.server [-] x.x.x.x00 - - [17/Sep/2021 13:17:42] "GET / HTTP/1.1" 300 1446 0.001949
2021-09-17 13:17:46.351 957 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:18:12.336 955 INFO eventlet.wsgi.server [-] x.x.x.x00 - - [17/Sep/2021 13:18:12] "GET / HTTP/1.1" 300 1446 0.003754
2021-09-17 13:18:13.703 956 INFO eventlet.wsgi.server [req-1a00db34-349b-4ed3-8625-edfadc939be4 d33b56f82a984f6a99bad58717f6f0ef 6ee7e3517d254f9e9ee5eac7d56a3179 - default default] x.x.x.x09,x.x.x.x04 - - [17/Sep/2021 13:18:13] "GET /v2/images?limit=20 HTTP/1.1" 200 2990 0.139229
2021-09-17 13:18:33.065 955 INFO eventlet.wsgi.server [req-a57046eb-549e-4837-95c7-d0994c05449e d33b56f82a984f6a99bad58717f6f0ef 6ee7e3517d254f9e9ee5eac7d56a3179 - default default] x.x.x.x09,x.x.x.x04 - - [17/Sep/2021 13:18:33] "GET /v2/images?limit=20 HTTP/1.1" 200 2990 0.096302
2021-09-17 13:18:33.077 955 INFO eventlet.wsgi.server [req-3a8af281-6732-4afd-b3ec-e502367bce51 d33b56f82a984f6a99bad58717f6f0ef 6ee7e3517d254f9e9ee5eac7d56a3179 - default default] x.x.x.x09,x.x.x.x04 - - [17/Sep/2021 13:18:33] "GET /v2/schemas/image HTTP/1.1" 200 6259 0.003982
2021-09-17 13:18:36.747 959 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:18:36.759 959 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:18:42.460 957 INFO eventlet.wsgi.server [-] x.x.x.x00 - - [17/Sep/2021 13:18:42] "GET / HTTP/1.1" 300 1446 0.001383
2021-09-17 13:18:52.659 956 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:18:52.670 956 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:19:12.598 955 INFO eventlet.wsgi.server [-] x.x.x.x00 - - [17/Sep/2021 13:19:12] "GET / HTTP/1.1" 300 1446 0.001590
2021-09-17 13:19:16.367 957 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:19:27.127 955 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:19:27.184 955 INFO eventlet.wsgi.server [req-8f68efc9-f46a-450d-8138-928e4f2585b4 d33b56f82a984f6a99bad58717f6f0ef 6ee7e3517d254f9e9ee5eac7d56a3179 - default default] x.x.x.x09,x.x.x.x04 - - [17/Sep/2021 13:19:27] "GET /v2/images?limit=20 HTTP/1.1" 200 2990 0.032917



/var/log/kolla/keystone/keystone.log
2021-09-17 13:13:54.670 958 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:13:55.148 960 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:13:55.150 962 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:13:55.345 959 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:13:56.044 961 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:14:18.402 966 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:14:31.372 963 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:14:36.580 967 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:14:38.771 965 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:16:12.226 964 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer


/var/log/kolla/nova/nova-compute.log
2021-09-17 13:10:07.710 7 INFO oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] Reconnected to AMQP server on x.x.x.x00:5672 via [amqp] client with port 56226.
2021-09-17 13:11:11.473 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:11:34.046 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:11:34.721 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:13:07.699 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] AMQP server on x.x.x.x00:5672 is unreachable: Server unexpectedly closed connection. Trying again in 1 seconds.: OSError: Server unexpectedly closed connection
2021-09-17 13:13:08.734 7 INFO oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] Reconnected to AMQP server on x.x.x.x00:5672 via [amqp] client with port 56434.
2021-09-17 13:14:11.501 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:14:34.070 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:14:34.745 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:16:08.724 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] AMQP server on x.x.x.x00:5672 is unreachable: Server unexpectedly closed connection. Trying again in 1 seconds.: OSError: Server unexpectedly closed connection
2021-09-17 13:16:09.756 7 INFO oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] Reconnected to AMQP server on x.x.x.x00:5672 via [amqp] client with port 56638.
2021-09-17 13:17:34.098 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:19:09.747 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] AMQP server on x.x.x.x00:5672 is unreachable: Server unexpectedly closed connection. Trying again in 1 seconds.: OSError: Server unexpectedly closed connection
2021-09-17 13:19:10.781 7 INFO oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] Reconnected to AMQP server on x.x.x.x00:5672 via [amqp] client with port 56834.
2021-09-17 13:20:34.124 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:20:34.777 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:20:51.789 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:22:10.772 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] AMQP server on x.x.x.x00:5672 is unreachable: Server unexpectedly closed connection. Trying again in 1 seconds.: OSError: Server unexpectedly closed connection
2021-09-17 13:22:11.806 7 INFO oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] Reconnected to AMQP server on x.x.x.x00:5672 via [amqp] client with port 57038.
2021-09-17 13:23:34.149 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:23:34.803 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:23:51.818 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:25:11.795 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] AMQP server on x.x.x.x00:5672 is unreachable: Server unexpectedly closed connection. Trying again in 1 seconds.: OSError: Server unexpectedly closed connection
2021-09-17 13:25:12.842 7 INFO oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] Reconnected to AMQP server on x.x.x.x00:5672 via [amqp] client with port 57240.
2021-09-17 13:26:34.176 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:26:34.830 7 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2021-09-17 13:28:12.812 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] AMQP server on x.x.x.x00:5672 is unreachable: Server unexpectedly closed connection. Trying again in 1 seconds.: OSError: Server unexpectedly closed connection
2021-09-17 13:28:13.848 7 INFO oslo.messaging._drivers.impl_rabbit [-] [9d51e0ec-bd43-46b4-8c57-a9c6aa01903d] Reconnected to AMQP server on x.x.x.x00:5672 via [amqp] client with port 57446.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20210917/77b181cd/attachment-0001.htm>


More information about the openstack-discuss mailing list