When I say scale up, I mean that, yes. Plus, of course, bumping the number of workers (rpc_workers) to an appropriate value. There are risks with modifying that. It might be worth asking in the neutron channel on IRC, at least if this is your production deployment. If possible, maybe test it in a lab or staging deployment first.

Best Regards, Erik Olof Gunnar Andersson

-----Original Message-----
From: Satish Patel <satish.txt@gmail.com>
Sent: Friday, March 20, 2020 12:47 PM
To: Erik Olof Gunnar Andersson <eandersson@blizzard.com>
Cc: Grant Morley <grant@civo.com>; openstack-discuss@lists.openstack.org
Subject: Re: Neutron RabbitMQ issues

Do you think I should try adding the option "new_facade = True" by hand on the server and restarting the neutron-server services? I am not seeing any extra dependency for that option, so it looks very simple to add.

When you said you scaled out the number of workers, do you mean you added multiple neutron-servers on a bunch of VMs to spread the load? (I have 3x controller nodes running on physical servers.)

On Fri, Mar 20, 2020 at 3:32 PM Erik Olof Gunnar Andersson <eandersson@blizzard.com> wrote:
This should just be for the server, AFAIK. I haven't tried it out myself, but we for sure have the same issue. We just scaled out the number of workers as a workaround. In fact, we even added neutron-servers on VMs to handle the issue.
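For reference, the worker counts live in neutron.conf on the server. A minimal sketch of the relevant section; the values below are purely illustrative assumptions, not recommendations:

    [DEFAULT]
    # Separate API worker processes (often sized relative to CPU count)
    api_workers = 8
    # RPC worker processes consuming from the message bus
    rpc_workers = 8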
Best Regards, Erik Olof Gunnar Andersson
-----Original Message-----
From: Satish Patel <satish.txt@gmail.com>
Sent: Friday, March 20, 2020 11:19 AM
To: Erik Olof Gunnar Andersson <eandersson@blizzard.com>
Cc: Grant Morley <grant@civo.com>; openstack-discuss@lists.openstack.org
Subject: Re: Neutron RabbitMQ issues
Erik,
That is a good finding. I have checked the following file:

/openstack/venvs/neutron-19.0.0.0rc3.dev6/lib/python2.7/site-packages/neutron/objects/agent.py

Do you think I should add the following option and restart neutron-server? Is this for all the compute node agents, or just for the server?
new_facade = True
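For context, new_facade is a class attribute on Neutron's versioned DB objects, so the edit is a one-line addition to the Agent class in agent.py. A rough sketch of what that looks like; the decorator and base class names are recalled from the upstream tree and should be verified against your release:

    # neutron/objects/agent.py (sketch - verify against your file)
    @base.NeutronObjectRegistry.register
    class Agent(base.NeutronDbObject):
        # Opt this object into the new oslo.db enginefacade
        new_facade = True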
On Fri, Mar 20, 2020 at 1:51 PM Erik Olof Gunnar Andersson <eandersson@blizzard.com> wrote:
Best Regards, Erik Olof Gunnar Andersson
From: Satish Patel <satish.txt@gmail.com>
Sent: Friday, March 20, 2020 9:23 AM
To: Grant Morley <grant@civo.com>
Cc: Erik Olof Gunnar Andersson <eandersson@blizzard.com>; openstack-discuss@lists.openstack.org
Subject: Re: Neutron RabbitMQ issues
Oh, you are right here. I have the following stuff in my neutron.conf on the server:
# Notifications
[oslo_messaging_notifications]
driver = messagingv2
topics = notifications
transport_url = rabbit://neutron:5be2a043f9a93adbd@172.28.15.192:5671,neutron:5be2a043f9a93adbd@172.28.15.248:5671,neutron:5be2a043f9a93adbd@172.28.15.22:5671//neutron?ssl=1

# Messaging
[oslo_messaging_rabbit]
rpc_conn_pool_size = 30
ssl = True
The following are the changes I am going to make; let me know if anything is missing.
[DEFAULT]
executor_thread_pool_size = 2048  <--- is this correct? I didn't see "rpc_thread_pool_size" anywhere.
rpc_response_timeout = 3600

[oslo_messaging_notifications]
topics = notifications
driver = noop

# Messaging
[oslo_messaging_rabbit]
rpc_conn_pool_size = 300
heartbeat_timeout_threshold = 0
ssl = True
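After restarting neutron-server with these changes, one way to confirm the queues are draining is plain rabbitmqctl; the vhost below is an assumption based on the //neutron at the end of the transport_url above:

    rabbitmqctl list_queues -p /neutron name messages messages_ready | grep notifications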
Btw you might not necessarily be having RabbitMQ issues. You might also be experiencing something like this.
https://bugs.launchpad.net/neutron/+bug/1853071
Best Regards, Erik Andersson
Should I be adding this to all my compute nodes also?
On Fri, Mar 20, 2020 at 11:40 AM Grant Morley <grant@civo.com> wrote:
If you tune RabbitMQ, then setting:
heartbeat_timeout_threshold = 0
That should help with the error message you are getting.
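For clarity, that option belongs in the oslo.messaging RabbitMQ section of the service's config file; a minimal sketch:

    [oslo_messaging_rabbit]
    # 0 disables oslo.messaging's driver-level heartbeat check
    heartbeat_timeout_threshold = 0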
That is a lot of messages queued. We had the same problem because we were not using Ceilometer but still had the "notifications" turned on for it in the services' configs.

Are all of the ready messages for "notifications.info" for the various services (Nova, Neutron, Keystone, etc.)?

If that is the case, you can disable those messages in your config files for those services. Look for:
# Notifications
[oslo_messaging_notifications]
notification_topics = notifications
driver = noop
Make sure the driver option is set to "noop"; by default it will be set to "messagingv2". Then restart the service, and that should stop it sending messages to the queue. You can then purge the "notifications.info" queue for Nova or Neutron, etc.
We only had the "messages ready" build-up when we had the setting for Ceilometer enabled but were not using it. Also, only purge a queue if it is for that reason. Do not purge the queue for any other reason, as it can cause issues.
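If you do decide to purge, the standard command is rabbitmqctl purge_queue; the vhost and queue name below are assumptions to adapt to your deployment:

    rabbitmqctl purge_queue -p /nova notifications.info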
Hope that helps.