[openstack-dev] [oslo][messaging] Further improvements and refactoring

Alexei Kornienko alexei.kornienko at gmail.com
Thu Jun 26 20:38:24 UTC 2014


Hello Jay,

The benchmark for oslo.messaging is really simple:
you create a client that sends messages in an endless loop and a server that 
processes them. Then you can use the RabbitMQ management plugin to see the 
average throughput in the queue.
A simple example can be found here - 
https://github.com/andreykurilin/profiler-for-oslo-messaging
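
A minimal sketch of such a cast-loop client follows. The broker URL, the
'profiler' topic and the 'process' method are illustrative assumptions, not
taken from the repo above, and the import style matches the 2014-era
oslo.messaging API:

```python
# Cast-loop client sketch for measuring oslo.messaging send throughput.
# Broker URL, topic and method names below are illustrative assumptions.
import time


def measure_throughput(send, count):
    """Call send(i) `count` times and return calls per second."""
    start = time.time()
    for i in range(count):
        send(i)
    return count / (time.time() - start)


def main():
    from oslo.config import cfg
    from oslo import messaging

    transport = messaging.get_transport(
        cfg.CONF, 'rabbit://guest:guest@localhost:5672/')
    client = messaging.RPCClient(transport, messaging.Target(topic='profiler'))
    # cast() returns without waiting for a reply, so this measures only
    # how fast the client can push messages out.
    rate = measure_throughput(
        lambda i: client.cast({}, 'process', payload=i), 10000)
    print('%.0f casts/sec' % rate)

# With a RabbitMQ broker running and a server listening on the topic: main()
```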

I've mentioned some of this already in my previous mails, but here it is 
again:

> "Huge" is descriptive but not quantitative :) Do you have any numbers 
> that pinpoint the amount of time that is being spent reconstructing 
> and declaring the queues, say, compared to the time spent doing 
> transmission? 
I don't have a precise slowdown percentage for each issue; I've just 
identified the hotspots using cProfile and strace.
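
For reference, a cProfile run of that kind looks roughly like the following;
the `workload` function here is a stand-in for the real send loop that was
profiled:

```python
# Sketch of locating hotspots with cProfile, as described above.
import cProfile
import io
import pstats


def workload():
    # stand-in for the oslo.messaging send loop that was actually profiled
    total = 0
    for i in range(100000):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

out = io.StringIO()
# sort by cumulative time to surface the expensive call paths (the hotspots)
pstats.Stats(profiler, stream=out).sort_stats('cumulative').print_stats(10)
print(out.getvalue())
```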

> This ten times number... is that an estimate or do you have hard 
> numbers that show that? Just curious. 
Using the script I mentioned earlier, I get an average throughput on 
my PC of ~700 cast calls per second.
I've also written a simple, naive script that uses kombu directly (in a 
single-threaded, synchronous way with a single connection). It gave me a 
throughput of ~8000 messages per second.
That's why I say the library should work at least 10 times faster.

Regards,
Alexei Kornienko


On 06/26/2014 11:22 PM, Jay Pipes wrote:
> Hey Alexei, thanks for sharing your findings. Comments inline.
>
> On 06/26/2014 10:08 AM, Alexei Kornienko wrote:
>> Hello,
>>
>> Returning to the performance issues of oslo.messaging.
>> I've found the two biggest issues in the existing implementation of the
>> rabbit (kombu) driver:
>>
>> 1) For almost every message sent or received, a new Consumer/Publisher
>> object is created, and each of these objects tries to declare its queue
>> even if it is already declared:
>> https://github.com/openstack/oslo.messaging/blob/master/oslo/messaging/_drivers/impl_rabbit.py#L159 
>>
>>
>> This causes a huge slowdown.
>
> "Huge" is descriptive but not quantitative :) Do you have any numbers 
> that pinpoint the amount of time that is being spent reconstructing 
> and declaring the queues, say, compared to the time spent doing 
> transmission?
>
>> 2) With issue #1 fixed (I've applied a small hack to fix it in my
>> repo), the next big issue arises. For every RPC message received, a reply
>> is sent when processing is done (it seems that a reply is sent even for
>> "cast" calls, which is really strange to me). The reply is sent using a
>> connection pool to "speed up" replies. Due to a bad implementation of the
>> custom connection pool, for every message sent the underlying connection
>> channel is closed and reopened:
>> https://github.com/openstack/oslo.messaging/blob/master/oslo/messaging/_drivers/impl_rabbit.py#L689 
>>
>>
>> Because of these major issues, oslo.messaging performance is at least 10
>> times slower than it could be.
>
> This ten times number... is that an estimate or do you have hard 
> numbers that show that? Just curious.
>
>> My opinion is that there is no simple and elegant fix for these issues in
>> the current implementation of oslo.messaging (mostly because of the bad
>> architecture of the library and its processing flow). My proposal is that
>> we should start working on a new major release of the messaging library
>> with the architectural issues fixed. This will allow us to avoid the
>> issues mentioned and provide much more performance and flexibility for
>> end users.
>> The main goal we should achieve is to separate the RPC code from the
>> messaging code; this will allow us to implement both parts in a much
>> simpler and cleaner way, and at the same time it would be much faster.
>
> Perhaps actually a better starting point would be to create a 
> benchmarking harness that will allow us to see some baseline 
> throughput numbers that we can then compare to the iterative 
> improvements you will push?
>
> Best,
> -jay
>
>> Please share your thoughts on this topic.
>>
>> Regards,
>> Alexei Kornienko
>>
>> On 06/16/2014 02:47 PM, Gordon Sim wrote:
>>> On 06/13/2014 02:06 PM, Ihar Hrachyshka wrote:
>>>> On 10/06/14 15:40, Alexei Kornienko wrote:
>>>>> On 06/10/2014 03:59 PM, Gordon Sim wrote:
>>>>>>
>>>>>> I think there could be a lot of work required to significantly
>>>>>> improve that driver, and I wonder if that would be better spent
>>>>>> on e.g. the AMQP 1.0 driver which I believe will perform much
>>>>>> better and will offer more choice in deployment.
>>>>>
>>>>> I agree with you on this. However, I'm not sure that we can make such
>>>>> a decision. If we focus on the AMQP driver only, we should mention it
>>>>> explicitly and deprecate the qpid driver completely. There is no point
>>>>> in keeping a driver that is not really functional.
>>>>
>>>> The driver is functional. It may not be as efficient as the
>>>> alternatives, but that's not a valid reason to deprecate it.
>>>
>>> The question in my view is what the plan is for ongoing development.
>>>
>>> Will the driver get better over time, or is it likely to remain as is
>>> at best (or even deteriorate)?
>>>
>>> Choice is good, but every choice adds to the maintenance burden, in
>>> testing against regressions if nothing else.
>>>
>>> I think an explicit decision about the future is beneficial, whatever
>>> the decision may be.
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> OpenStack-dev mailing list
>>> OpenStack-dev at lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>



