[openstack-dev] RabbitMQ Scaling

Rosa, Andrea (HP Cloud Services) andrea.rosa at hp.com
Mon Nov 19 11:55:34 UTC 2012


Hi Ray

What about discussing your results with the RabbitMQ team (through the RabbitMQ mailing list)?
Maybe we are missing something.

Thanks
--
Andrea



From: Ray Pekowski [mailto:pekowski at gmail.com]
Sent: 17 November 2012 00:13
To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] RabbitMQ Scaling

On Fri, Nov 16, 2012 at 3:41 AM, Rosa, Andrea (HP Cloud Services) <andrea.rosa at hp.com> wrote:
I am really surprised that you have the same results. The configuration with RAM nodes should have an impact on the performance.
I have some other questions:
- are you creating a new connection for each call or do you have a connectionPool?

I just turned on the OpenStack DEBUG level logging and what I see from my client RPC load generator is only one "Pool creating new connection" message, so it appears that it is using a connectionPool for all other requests.  Since I add load by starting a new client RPC load generator every 60 seconds, a new pool is created every 60 seconds.

- do you have multiple producers and a single consumer? I mean, are all RPC requests for a single queue? If so, are you using a specific prefetch value? Are you using auto_ack?

Each group of 10 load generators goes to its own consumer (simulated service), but the performance really is bad from the very first load generator (when compared to RPC casts).  For example, the first load generator in the RPC call case only achieves 29 RPCs/sec, while in the cast case it achieves 240 RPCs/sec.
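To put those two rates side by side, here is a rough Python sketch that converts them to an average time per RPC.  It assumes the load generator issues one RPC at a time and waits for each to finish, which is my assumption and not something measured here; the function name is only illustrative.

```python
# Rates reported above for the first load generator.
CALL_RATE = 29.0    # RPC calls per second
CAST_RATE = 240.0   # RPC casts per second


def ms_per_rpc(rate_per_sec):
    """Average milliseconds per RPC, assuming strictly sequential requests."""
    return 1000.0 / rate_per_sec


call_ms = ms_per_rpc(CALL_RATE)
cast_ms = ms_per_rpc(CAST_RATE)
slowdown = CAST_RATE / CALL_RATE

print("call: %.1f ms/RPC" % call_ms)
print("cast: %.1f ms/RPC" % cast_ms)
print("calls are about %.1fx slower than casts" % slowdown)
```

Under that assumption a call costs roughly 30 ms more than a cast, so any per-call setup and teardown work on the broker would matter a lot.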

I don't know about prefetch or auto_ack.  Since I am using the OpenStack RPC abstraction, it is whatever OpenStack is using for these settings.  It might not be significant, since I am seeing much better performance with casts than calls and the casts use static queues and exchanges.  By static I mean they are created only once.

More evidence that this is due to serialized replication on the dynamically created queues and exchanges is the following table.  Each number is the time between the prior action and the start of the action listed, and the actions are listed in their real order of occurrence.  The responses to the "declares", the bind and the close (which implies that close destroys the "auto delete" exchange and queue) all take significantly longer in the cluster.  This is from a wireshark trace of a single producer and a single consumer, captured on the RabbitMQ server.  RabbitMQ is taking significantly longer to do something in the clustered case, and that something is most likely the replication of the creation and destruction of the queue and exchange.

No cluster  Cluster of 3   AMQP Action
(in ms)     (in ms)
0.647       1.704          Type: Method (1) Channel: 1 Class: Channel (20) Method: Open (10)
0.451       0.468          Type: Method (1) Channel: 1 Class: Channel (20) Method: Open-Ok (11)
1.484       1.459          Type: Method (1) Channel: 1 Class: Exchange (40) Method: Declare (10) Exchange: XXXXXXXXXX Type: direct
0.431       2.612          Type: Method (1) Channel: 1 Class: Exchange (40) Method: Declare-Ok (11)
0.612       0.612          Type: Method (1) Channel: 1 Class: Queue (50) Method: Declare (10) Queue: XXXXXXXXXX
0.86        4.369          Type: Method (1) Channel: 1 Class: Queue (50) Method: Declare-Ok (11) Queue: XXXXXXXXXX
0.64        0.723          Type: Method (1) Channel: 1 Class: Queue (50) Method: Bind (20) Queue: XXXXXXXXXX Exchange: XXXXXXXXXX Routing-Key: XXXXXXXXXX
0.622       3.34           Type: Method (1) Channel: 1 Class: Queue (50) Method: Bind-Ok (21)
0.758       0.731          Type: Method (1) Channel: 1 Class: Exchange (40) Method: Declare (10) Exchange: nova Type: topic
0.193       0.194          Type: Method (1) Channel: 1 Class: Exchange (40) Method: Declare-Ok (11)
0.864       0.886          Type: Method (1) Channel: 1 Class: Basic (60) Method: Publish (40) Exchange: nova Routing-Key: perfsvc1
0.029       0.034          Type: Content header (2) Channel: 1
0.067       0.068          Type: Method (1) Channel: 1 Class: Basic (60) Method: Consume (20) Queue: XXXXXXXXXX
0.607       1.186          Type: Method (1) Channel: 1 Class: Basic (60) Method: Consume-Ok (21)
8.408       9.461          Type: Method (1) Channel: 1 Class: Channel (20) Method: Close (40)
1.8         10.47          Type: Method (1) Channel: 1 Class: Channel (20) Method: Close-Ok (41)
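
As a quick check on where the extra time goes, the table can be summed and ranked with a few lines of Python.  This is only a sketch over the numbers above: the short action labels are my own abbreviations, and the totals describe this single trace, not an average.

```python
# (no_cluster_ms, cluster_ms, action) copied from the table above.
ROWS = [
    (0.647, 1.704, "Channel Open"),
    (0.451, 0.468, "Channel Open-Ok"),
    (1.484, 1.459, "Exchange Declare (direct)"),
    (0.431, 2.612, "Exchange Declare-Ok (direct)"),
    (0.612, 0.612, "Queue Declare"),
    (0.86, 4.369, "Queue Declare-Ok"),
    (0.64, 0.723, "Queue Bind"),
    (0.622, 3.34, "Queue Bind-Ok"),
    (0.758, 0.731, "Exchange Declare (nova)"),
    (0.193, 0.194, "Exchange Declare-Ok (nova)"),
    (0.864, 0.886, "Basic Publish"),
    (0.029, 0.034, "Content header"),
    (0.067, 0.068, "Basic Consume"),
    (0.607, 1.186, "Basic Consume-Ok"),
    (8.408, 9.461, "Channel Close"),
    (1.8, 10.47, "Channel Close-Ok"),
]


def totals(rows):
    """Return (no_cluster_total_ms, cluster_total_ms) for the whole trace."""
    return sum(r[0] for r in rows), sum(r[1] for r in rows)


def largest_increases(rows, n=4):
    """Return the n actions whose delay grew the most in the cluster."""
    deltas = [(round(c - nc, 3), action) for nc, c, action in rows]
    return sorted(deltas, reverse=True)[:n]


no_cluster_total, cluster_total = totals(ROWS)
print("total: %.3f ms (no cluster) vs %.3f ms (cluster of 3)"
      % (no_cluster_total, cluster_total))
for delta, action in largest_increases(ROWS):
    print("+%.3f ms  %s" % (delta, action))
```

The four largest increases are the Close-Ok and the Declare-Ok and Bind-Ok responses for the dynamically created queue and exchange, which together account for most of the difference between the two totals.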

Ray