[openstack-dev] [oslo][messaging] Further improvements and refactoring

Ihar Hrachyshka ihrachys at redhat.com
Mon Jun 30 11:22:34 UTC 2014



> On 06/27/2014 04:04 PM, Ihar Hrachyshka wrote:
> On 26/06/14 22:38, Alexei Kornienko wrote:
>>>> Hello Jay,
>>>> 
>>>> The benchmark for oslo.messaging is really simple: you create a
>>>> client that sends messages indefinitely and a server that
>>>> processes them. Then you can use the RabbitMQ management plugin
>>>> to see the average throughput in the queue. A simple example can
>>>> be found here - 
>>>> https://github.com/andreykurilin/profiler-for-oslo-messaging
>>>> 
>>>> I've mentioned some of this already in my previous mails but
>>>> here it is again:
>>>> 
>>>>> "Huge" is descriptive but not quantitative :) Do you have
>>>>> any numbers that pinpoint the amount of time that is being
>>>>> spent reconstructing and declaring the queues, say,
>>>>> compared to the time spent doing transmission?
>>>> I don't have a precise slowdown percentage for each issue. I've
>>>> just identified hotspots using cProfile and strace.
>>>> 
>>>>> This ten times number... is that an estimate or do you have
>>>>> hard numbers that show that? Just curious.
>>>> Using the script I mentioned earlier I get an average throughput
>>>> on my PC of ~700 cast calls per second. I've written a simple and
>>>> stupid script that uses kombu directly (in a single-threaded,
>>>> synchronous way with a single connection); it gave me a
>>>> throughput of ~8000 messages per second. That's why I say that
>>>> the library should work at least 10 times faster.
> It doesn't show that the major issues you've pointed out result in
> such a large drop in message processing speed, though. Maybe there
> are other causes of the slowdown we observe. Nor does it show that
> refactoring the code will actually help boost the library
> significantly.
>> It doesn't show that those major issues are the *only* reason for
>> the slowdown. But it shows that those issues are the biggest ones
>> currently visible.

My understanding is that your analysis is mostly based on running a
profiler against the code. Network operations can be bottlenecked in
other places.

You compare a 'simple script using kombu' with a 'script using
oslo.messaging'. You don't compare a script using oslo.messaging
before the refactoring with one after it. The latter would show
whether the refactoring was worth the effort. Your test shows that
oslo.messaging performance sucks, but it's not definite that the
hotspots you've revealed, once fixed, will show a huge boost.
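
For reference, the kind of direct-kombu loop being compared against
is roughly the following. This is only a sketch under assumptions (a
local RabbitMQ with default guest credentials, an arbitrary
'perf_test' queue name), not the actual script:

    import time

    import kombu

    # Single connection, single channel, synchronous publishing --
    # the "simple and stupid" direct-kombu baseline.
    with kombu.Connection('amqp://guest:guest@localhost:5672//') as conn:
        queue = conn.SimpleQueue('perf_test')  # queue declared once here
        n = 10000
        start = time.time()
        for i in range(n):
            queue.put({'seq': i})
        elapsed = time.time() - start
        queue.close()
        print('%d messages in %.2fs (%.0f msg/s)'
              % (n, elapsed, n / elapsed))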

My concern is that it may turn out that once all the effort to
refactor the code is done, we won't see a major difference. So we
need baseline numbers, and performance tests would be a great help
here.
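
To make this concrete, even something as small as the script below
would give a rough baseline for the cast path. Again, a sketch only:
the transport URL, topic and payload are placeholders, and a real run
would also want a server consuming the topic.

    import time

    from oslo.config import cfg
    import oslo.messaging as messaging

    # Rough cast-throughput measurement against a running broker.
    transport = messaging.get_transport(
        cfg.CONF, url='rabbit://guest:guest@localhost:5672/')
    client = messaging.RPCClient(transport,
                                 messaging.Target(topic='perf_test'))

    n = 10000
    start = time.time()
    for i in range(n):
        client.cast({}, 'noop', seq=i)  # fire-and-forget call
    elapsed = time.time() - start
    print('%d casts in %.2fs (%.0f casts/s)'
          % (n, elapsed, n / elapsed))

Running the same script before and after a proposed change is exactly
the before/after comparison mentioned above.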

>> We'll get a major speed boost if we fix them (and possibly
>> discover new issues that would prevent us from reaching full
>> speed).
> 
> Though having some hard numbers from using kombu directly is still
> a good thing. Is it possible that we introduce some performance
> tests into oslo.messaging itself?
>> Some performance tests may be introduced, but they would be more
>> like functional tests since they require setting up an actual
>> messaging server (rabbit, etc.).

Yes, I think we already have some. E.g.
tests/drivers/test_impl_qpid.py attempts to use a local Qpid server
(falling back to a fake server if it's not available). We could
create a separate subtree in the library for functional tests.
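
Such a subtree could share a small base class that skips tests when
no real broker is configured, roughly like this (illustrative only;
the environment variable and class names are made up):

    import os

    import testtools

    from oslo.config import cfg
    import oslo.messaging as messaging


    class FunctionalTestCase(testtools.TestCase):
        """Base class for tests that talk to a real messaging server."""

        def setUp(self):
            super(FunctionalTestCase, self).setUp()
            url = os.environ.get('OSLO_MESSAGING_TEST_URL')
            if not url:
                self.skipTest('no real messaging server configured')
            self.transport = messaging.get_transport(cfg.CONF, url=url)

Throughput-style tests could then build on it and simply print or
record their numbers for comparison between runs.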

>> What do you mean exactly by performance tests? Just testing
>> overall throughput with some basic scenarios, or do you mean
>> finding hotspots in the code?
> 

The former. Once we have some baseline numbers, we may use them to
consider whether the changes you propose are worth the effort.

> 
>>>> Regards, Alexei Kornienko
>>>> 
>>>> 
>>>> On 06/26/2014 11:22 PM, Jay Pipes wrote:
>>>>> Hey Alexei, thanks for sharing your findings. Comments
>>>>> inline.
>>>>> 
>>>>> On 06/26/2014 10:08 AM, Alexei Kornienko wrote:
>>>>>> Hello,
>>>>>> 
>>>>>> Returning to the performance issues of oslo.messaging: I've
>>>>>> found the 2 biggest issues in the existing implementation of
>>>>>> the rabbit (kombu) driver:
>>>>>> 
>>>>>> 1) For almost every message sent/received, a new object of
>>>>>> the Consumer/Publisher class is created, and each object of
>>>>>> this class tries to declare its queue even if it's already
>>>>>> declared. 
>>>>>> https://github.com/openstack/oslo.messaging/blob/master/oslo/messaging/_drivers/impl_rabbit.py#L159
>>>>>> This causes a huge slowdown.
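
(For illustration: the pattern described above can be contrasted with
declaring the queue once up front and reusing the same channel and
producer for all messages, roughly as in this plain-kombu sketch.
Names are made up; this is not the actual impl_rabbit code.)

    import kombu

    conn = kombu.Connection('amqp://guest:guest@localhost:5672//')
    channel = conn.channel()
    exchange = kombu.Exchange('demo_exchange', type='topic',
                              durable=False)
    queue = kombu.Queue('demo_queue', exchange, routing_key='demo',
                        channel=channel)
    queue.declare()  # one declaration round-trip, done up front
    producer = kombu.Producer(channel, exchange=exchange,
                              routing_key='demo')

    for i in range(10000):
        producer.publish({'seq': i})  # no per-message declare here
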
>>>>> "Huge" is descriptive but not quantitative :) Do you have
>>>>> any numbers that pinpoint the amount of time that is being
>>>>> spent reconstructing and declaring the queues, say,
>>>>> compared to the time spent doing transmission?
>>>>> 
>>>>>> 2) With issue #1 fixed (I've applied a small hack to fix it
>>>>>> in my repo), the next big issue arises. For every rpc message
>>>>>> received, a reply is sent when processing is done (it seems
>>>>>> that a reply is sent even for "cast" calls, which is really
>>>>>> strange to me). The reply is sent using a connection pool to
>>>>>> "speed up" replies. Due to a bad implementation of the custom
>>>>>> connection pool, for every message sent the underlying
>>>>>> connection channel is closed and reopened: 
>>>>>> https://github.com/openstack/oslo.messaging/blob/master/oslo/messaging/_drivers/impl_rabbit.py#L689
>>>>>> Because of these major issues, oslo.messaging performance is
>>>>>> at least 10 times slower than it could be.
>>>>> This ten times number... is that an estimate or do you have
>>>>> hard numbers that show that? Just curious.
>>>>> 
>>>>>> My opinion is that there is no simple and elegant fix for
>>>>>> these issues in the current implementation of oslo.messaging
>>>>>> (mostly because of the bad architecture of the library and
>>>>>> its processing flow). My proposal is that we should start
>>>>>> working on a new major release of the messaging library with
>>>>>> the architectural issues fixed. This will allow us to avoid
>>>>>> the mentioned issues and provide much more performance and
>>>>>> flexibility for end users. The main goal we should achieve is
>>>>>> to separate the rpc code from the messaging code; this will
>>>>>> allow us to implement both parts in a much simpler and
>>>>>> cleaner way, and at the same time it would be much faster.
>>>>> Perhaps actually a better starting point would be to create
>>>>> a benchmarking harness that will allow us to see some
>>>>> baseline throughput numbers that we can then compare to the
>>>>> iterative improvements you will push?
>>>>> 
>>>>> Best, -jay
>>>>> 
>>>>>> Please share your thoughts on this topic.
>>>>>> 
>>>>>> Regards, Alexei Kornienko
>>>>>> 
>>>>>> On 06/16/2014 02:47 PM, Gordon Sim wrote:
>>>>>>> On 06/13/2014 02:06 PM, Ihar Hrachyshka wrote:
>>>>>>>> On 10/06/14 15:40, Alexei Kornienko wrote:
>>>>>>>>> On 06/10/2014 03:59 PM, Gordon Sim wrote:
>>>>>>>>>> I think there could be a lot of work required to 
>>>>>>>>>> significantly improve that driver, and I wonder
>>>>>>>>>> if that would be better spent on e.g. the AMQP
>>>>>>>>>> 1.0 driver which I believe will perform much
>>>>>>>>>> better and will offer more choice in deployment.
>>>>>>>>> I agree with you on this. However I'm not sure that
>>>>>>>>> we can make such a decision. If we focus on the amqp
>>>>>>>>> driver only, we should mention it explicitly and
>>>>>>>>> deprecate the qpid driver completely. There is no
>>>>>>>>> point in keeping a driver that is not really
>>>>>>>>> functional.
>>>>>>>> The driver is functional. It may not be as efficient
>>>>>>>> as the alternatives, but that's not a valid reason to
>>>>>>>> deprecate it.
>>>>>>> The question in my view is what the plan is for
>>>>>>> ongoing development.
>>>>>>> 
>>>>>>> Will the driver get better over time, or is it likely
>>>>>>> to remain as is at best (or even deteriorate)?
>>>>>>> 
>>>>>>> Choice is good, but every choice adds to the
>>>>>>> maintenance burden, in testing against regressions if
>>>>>>> nothing else.
>>>>>>> 
>>>>>>> I think an explicit decision about the future is
>>>>>>> beneficial, whatever the decision may be.