[oslo.messaging] RabbitMQ and streams
Hey all,

Has anyone already worked on integrating RabbitMQ streams (see [1]) to replace fanouts in the oslo.messaging RabbitMQ driver?

We are wondering whether this would help lower the load on RabbitMQ and perhaps improve its stability.

Our deployment relies heavily on fanouts to deliver messages to computes (mostly neutron, e.g. secgroup updates / remote cache population), which leads to a massive number of messages being delivered per second.

Cheers,
Arnaud

[1] https://www.rabbitmq.com/streams.html
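For readers comparing the two approaches: with a fanout exchange, each consumer has its own queue and the broker copies every message into all of them, whereas a stream is a single append-only log that many consumers read from. At the AMQP level the queue implementation is chosen with the `x-queue-type` argument, and a stream consumer states where to start reading with `x-stream-offset`. The sketch below only builds those argument dictionaries; the helper names are hypothetical and are not part of oslo.messaging, and treating classic queues as transient is an assumption made for illustration.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# part of oslo.messaging or any RabbitMQ client library.

VALID_QUEUE_TYPES = ("classic", "quorum", "stream")


def queue_declare_arguments(queue_type):
    """Return (durable, arguments) for declaring a queue of the given type."""
    if queue_type not in VALID_QUEUE_TYPES:
        raise ValueError(f"unknown queue type: {queue_type!r}")
    # RabbitMQ selects the queue implementation from this argument.
    arguments = {"x-queue-type": queue_type}
    # Quorum queues and streams must be declared durable. A classic queue
    # may be either; this sketch assumes a transient one.
    durable = queue_type in ("quorum", "stream")
    return durable, arguments


def stream_consume_arguments(offset="next"):
    """Return consumer arguments for reading from a stream.

    "next" asks only for messages published after the consumer attaches;
    "first" and "last" are other offsets that RabbitMQ accepts.
    """
    return {"x-stream-offset": offset}
```

These dictionaries would be passed as the `arguments` of the queue declaration and of the consume call in whichever AMQP client is used.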
Hey all,

Following up on this discussion, we implemented a few things in oslo.messaging:
- rely on streams instead of fanouts
- switch ALL queues to quorum (no more transient/classic queues)
- switch to consistent queue naming (no more random uuid4 in queue names)

For the last point, we had two main goals:
- being able to easily identify a queue from its name (we added the compute hostname and process name to the queue name; as operators, this lets us quickly identify which server/service a queue belongs to)
- re-using the queues after a service restart (we found that high queue churn is problematic for RabbitMQ)

The results are pretty awesome! (pictures attached to the LP bug)
- the CPU load dropped a lot (divided by 5)
- the memory usage also dropped (divided by 2)
- the number of queues was divided by 2 (neutron relies heavily on fanouts for remote cache population / we are using OVS-based agents)

However, the network traffic increased (multiplied by 3), mostly due to the switch from classic queues (only a few of which were HA) to quorum queues (all HA).

We pushed some of our patches to the oslo.messaging repo here:
https://review.opendev.org/q/topic:bug-2031497
all related to this bug:
https://bugs.launchpad.net/oslo.messaging/+bug/2031497

Feel free to review/comment.

The bonus of switching all queues to quorum (HA) is that we can now easily drain a RabbitMQ node without affecting the OpenStack region (i.e. RabbitMQ is no longer a SPOF).

Cheers,
Arnaud, on behalf of the OVHcloud team.

(In reply to Arnaud Morin's message of 23.07.23 - 13:03.)
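To make the naming change concrete, here is a minimal sketch contrasting a random, uuid4-based queue name with a deterministic one built from the topic, hostname and process name. The exact format and separator are assumptions for illustration, and both function names are hypothetical; the actual scheme is in the patches linked from the bug.

```python
import uuid


def random_fanout_queue_name(topic):
    """Old style (assumed format): a fresh uuid4 suffix on every start."""
    return f"{topic}_fanout_{uuid.uuid4().hex}"


def consistent_queue_name(topic, hostname, processname):
    """New style (hypothetical format): the same inputs give the same name.

    A restarted service therefore finds and re-uses its previous queue,
    and an operator can read the owning host and process from the name.
    """
    for label, value in (("topic", topic),
                         ("hostname", hostname),
                         ("processname", processname)):
        if not value:
            raise ValueError(f"{label} must be a non-empty string")
    return f"{topic}_{hostname}_{processname}"
```

Calling `consistent_queue_name("example-topic", "compute-1", "example-agent")` twice returns the same string, whereas two calls to `random_fanout_queue_name("example-topic")` produce two different names, which is where the queue churn on restart comes from.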
Hey,

That sounds really great. Thanks a lot for your work!

Fingers crossed that this lands in oslo.messaging soon, as I am quite eager to get it and start using it :)

(In reply to the message from Arnaud Morin <arnaud.morin@gmail.com>, Wed, 16 Aug 2023 at 10:52.)
participants (2)
- Arnaud Morin
- Dmitriy Rabotyagov