Wallaby on Ubuntu 20.04, Neutron 18.6.0 neutron-dhcp-agent RPC unusually slow

Zakhar Kirpichenko zakhar at gmail.com
Tue Mar 14 18:28:32 UTC 2023


If anyone is interested, I reported the bug/regression:
https://bugs.launchpad.net/cloud-archive/+bug/2011513

Is anyone else facing such issues?

/Z

On Tue, 14 Mar 2023 at 08:34, Zakhar Kirpichenko <zakhar at gmail.com> wrote:

> Hi!
>
> We're running OpenStack Wallaby on Ubuntu 20.04, 3 high-performance infra
> nodes with a RabbitMQ cluster. I updated the Neutron components to version
> 18.6.0, which recently became available in the cloud archive repository (
> http://ubuntu-cloud.archive.canonical.com/ubuntu focal-updates/wallaby
> main). The exact package versions updated are as follows:
>
> Install: libunbound8:amd64 (1.9.4-2ubuntu1.4, automatic),
> openvswitch-common:amd64 (2.15.2-0ubuntu1~cloud0, automatic)
> Upgrade: neutron-common:amd64 (2:18.5.0-0ubuntu1~cloud0,
> 2:18.6.0-0ubuntu1~cloud1), python3-werkzeug:amd64 (0.16.1+dfsg1-2,
> 0.16.1+dfsg1-2ubuntu0.1), neutron-dhcp-agent:amd64
> (2:18.5.0-0ubuntu1~cloud0, 2:18.6.0-0ubuntu1~cloud1),
> neutron-l3-agent:amd64 (2:18.5.0-0ubuntu1~cloud0,
> 2:18.6.0-0ubuntu1~cloud1), python3-neutron:amd64 (2:18.5.0-0ubuntu1~cloud0,
> 2:18.6.0-0ubuntu1~cloud1), neutron-server:amd64 (2:18.5.0-0ubuntu1~cloud0,
> 2:18.6.0-0ubuntu1~cloud1), neutron-plugin-ml2:amd64
> (2:18.5.0-0ubuntu1~cloud0, 2:18.6.0-0ubuntu1~cloud1),
> neutron-metadata-agent:amd64 (2:18.5.0-0ubuntu1~cloud0,
> 2:18.6.0-0ubuntu1~cloud1), neutron-linuxbridge-agent:amd64
> (2:18.5.0-0ubuntu1~cloud0, 2:18.6.0-0ubuntu1~cloud1)
>
> Installed Neutron packages:
>
> ii  neutron-common                        2:18.6.0-0ubuntu1~cloud1
>                     all          Neutron is a virtual network service for
> Openstack - common
> ii  neutron-dhcp-agent                    2:18.6.0-0ubuntu1~cloud1
>                     all          Neutron is a virtual network service for
> Openstack - DHCP agent
>  Firewall-as-a-Service driver for OpenStack Neutron
> ii  neutron-l3-agent                      2:18.6.0-0ubuntu1~cloud1
>                     all          Neutron is a virtual network service for
> Openstack - l3 agent
> ii  neutron-linuxbridge-agent             2:18.6.0-0ubuntu1~cloud1
>                     all          Neutron is a virtual network service for
> Openstack - linuxbridge agent
> ii  neutron-metadata-agent                2:18.6.0-0ubuntu1~cloud1
>                     all          Neutron is a virtual network service for
> Openstack - metadata agent
> ii  neutron-plugin-ml2                    2:18.6.0-0ubuntu1~cloud1
>                     all          Neutron is a virtual network service for
> Openstack - ML2 plugin
> ii  neutron-server                        2:18.6.0-0ubuntu1~cloud1
>                     all          Neutron is a virtual network service for
> Openstack - server
> ii  python3-neutron                       2:18.6.0-0ubuntu1~cloud1
>                     all          Neutron is a virtual network service for
> Openstack - Python library
> ii  python3-neutron-lib                   2.10.1-0ubuntu1~cloud0
>                     all          Neutron shared routines and utilities -
> Python 3.x
> ii  python3-neutronclient                 1:7.2.1-0ubuntu1~cloud0
>                      all          client API library for Neutron - Python
> 3.x
>
> Normally this would be an easy update, but this time neutron-dhcp-agent
> doesn't work properly:
>
> 2023-03-14 05:44:27.572 2534501 INFO neutron.agent.dhcp.agent
> [req-4a362701-cc1f-4b9d-87e6-045b6a388709 - - - - -] Synchronizing state
> complete
> 2023-03-14 05:44:38.868 2534501 ERROR neutron_lib.rpc
> [req-cb1dc604-1372-44cd-bc06-09496ed5f68f - - - - -] Timeout in RPC method
> dhcp_ready_on_ports. Waiting for 55 seconds before next attempt. If the
> server is not down, consider increasing the rpc_response_timeout option as
> Neutron server(s) may be overloaded and unable to respond quickly enough.:
> oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply
> to message ID bd97110b004e413cb2d6b05d9fb3b57c
> 2023-03-14 05:44:38.871 2534501 WARNING neutron_lib.rpc
> [req-cb1dc604-1372-44cd-bc06-09496ed5f68f - - - - -] Increasing timeout for
> dhcp_ready_on_ports calls to 120 seconds. Restart the agent to restore it
> to the default value.: oslo_messaging.exceptions.MessagingTimeout: Timed
> out waiting for a reply to message ID bd97110b004e413cb2d6b05d9fb3b57c
> 2023-03-14 05:45:34.244 2534501 ERROR neutron.agent.dhcp.agent
> [req-cb1dc604-1372-44cd-bc06-09496ed5f68f - - - - -] Timeout notifying
> server of ports ready. Retrying...:
> oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply
> to message ID bd97110b004e413cb2d6b05d9fb3b57c
> 2023-03-14 05:47:10.876 2534501 INFO oslo_messaging._drivers.amqpdriver
> [-] No calling threads waiting for msg_id : bd97110b004e413cb2d6b05d9fb3b57c
> 2023-03-14 05:47:34.353 2534501 ERROR neutron_lib.rpc
> [req-607a9252-49b1-4043-aa0d-2457b78dc99e - - - - -] Timeout in RPC method
> dhcp_ready_on_ports. Waiting for 27 seconds before next attempt. If the
> server is not down, consider increasing the rpc_response_timeout option as
> Neutron server(s) may be overloaded and unable to respond quickly enough.:
> oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply
> to message ID f254f735998243c4b0a58ce95c974534
> 2023-03-14 05:47:34.354 2534501 WARNING neutron_lib.rpc
> [req-607a9252-49b1-4043-aa0d-2457b78dc99e - - - - -] Increasing timeout for
> dhcp_ready_on_ports calls to 240 seconds. Restart the agent to restore it
> to the default value.: oslo_messaging.exceptions.MessagingTimeout: Timed
> out waiting for a reply to message ID f254f735998243c4b0a58ce95c974534
> 2023-03-14 05:47:46.681 2534501 INFO oslo_messaging._drivers.amqpdriver
> [-] No calling threads waiting for msg_id : f254f735998243c4b0a58ce95c974534
> 2023-03-14 05:48:01.086 2534501 ERROR neutron.agent.dhcp.agent
> [req-607a9252-49b1-4043-aa0d-2457b78dc99e - - - - -] Timeout notifying
> server of ports ready. Retrying...:
> oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply
> to message ID f254f735998243c4b0a58ce95c974534
> 2023-03-14 05:49:45.035 2534501 INFO neutron.agent.dhcp.agent
> [req-5935a0d0-a981-463c-a4ea-23ccbb54c896 - - - - -] DHCP configuration for
> ports ... (A successful configuration here).
>
> While neutron-dhcp-agent is waiting, neutron-server log gets filled up
> with:
>
> neutron-server.log:2023-03-14 05:47:05.761 4171971 INFO
> neutron.plugins.ml2.plugin [req-cb1dc604-1372-44cd-bc06-09496ed5f68f - - -
> - -] Attempt 1 to provision port 18cddbb8-f3ed-4b49-9c6f-c0c67b4f7c76
> ...
> neutron-server.log:2023-03-14 05:47:10.727 4171971 INFO
> neutron.plugins.ml2.plugin [req-cb1dc604-1372-44cd-bc06-09496ed5f68f - - -
> - -] Attempt 10 to provision port 18cddbb8-f3ed-4b49-9c6f-c0c67b4f7c76
>
> This repeats for each port of each network neutron-dhcp-agent needs to
> configure.
>
> Each subsequent configuration for each network takes about 1-2
> minutes, depending on the network size. With earlier Neutron versions the
> whole process of configuring all networks would finish in under a minute,
> i.e. DHCP configuration per port (and network) is several orders of
> magnitude slower than it should be. Once neutron-dhcp-agent finishes
> synchronization, it seems to work without issues. There aren't many
> changes in our cloud, so it's hard to tell whether it's fast or slow, but
> individual port updates seem to happen quickly.
>
> All other services and the RabbitMQ cluster are working well, the infra
> nodes are not overloaded, and there are no apparent issues other than
> this one with Neutron. I am therefore inclined to think that the issue is
> specific to version 18.6.0 of neutron-dhcp-agent or neutron-server.
>
> I would appreciate any advice!
>
> Best regards,
> Zakhar
>
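
For anyone who wants to check their own logs for the same symptoms, here is
a minimal Python sketch. It is not part of Neutron: the function names and
regular expressions are assumptions derived only from the log lines quoted
above, and it expects each log entry on a single line (not wrapped as in
this email). It counts "Timeout in RPC method" events in the
neutron-dhcp-agent log and reports the highest "Attempt N to provision
port" value per port in the neutron-server log.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed pattern, based on the agent log lines quoted above, e.g.
# "... Timeout in RPC method dhcp_ready_on_ports. Waiting for 55 seconds ..."
RPC_TIMEOUT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"Timeout in RPC method (?P<method>\w+)\. "
    r"Waiting for (?P<wait>\d+) seconds"
)

# Assumed pattern, based on the server log lines quoted above, e.g.
# "... Attempt 10 to provision port 18cddbb8-f3ed-4b49-9c6f-c0c67b4f7c76"
PROVISION_RE = re.compile(
    r"Attempt (?P<n>\d+) to provision port (?P<port>[0-9a-f-]{36})"
)


def summarize_rpc_timeouts(lines):
    """Return (timeouts per RPC method, [(timestamp, method, wait_seconds)])."""
    counts = Counter()
    events = []
    for line in lines:
        match = RPC_TIMEOUT_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            counts[match.group("method")] += 1
            events.append((ts, match.group("method"), int(match.group("wait"))))
    return counts, events


def max_provision_attempts(lines):
    """Return {port_id: highest 'Attempt N' number seen for that port}."""
    attempts = {}
    for line in lines:
        match = PROVISION_RE.search(line)
        if match:
            port = match.group("port")
            attempts[port] = max(attempts.get(port, 0), int(match.group("n")))
    return attempts
```

Both functions accept any iterable of strings, so an open log file can be
passed in directly. Many timeouts for dhcp_ready_on_ports together with
ports that repeatedly reach high attempt numbers would match the behaviour
described above.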