[openstack-dev] [oslo][messaging] Further improvements and refactoring

Alexei Kornienko alexei.kornienko at gmail.com
Tue Jul 1 13:55:44 UTC 2014


Hi,

Thanks for the detailed answer.
Please see my comments inline.

Regards,

On 07/01/2014 04:28 PM, Ihar Hrachyshka wrote:
>
> On 30/06/14 21:34, Alexei Kornienko wrote:
>> Hello,
>>
>>
>> My understanding is that your analysis is mostly based on running
>> a profiler against the code. Network operations can be bottlenecked
>> in other places.
>>
>> You compare 'simple script using kombu' with 'script using
>> oslo.messaging'. You don't compare a script using oslo.messaging
>> before refactoring and after it. The latter would show whether
>> refactoring was worth the effort. Your test shows that
>> oslo.messaging performance sucks, but it's not definite that
>> hotspots you've revealed, once fixed, will show huge boost.
>>
>> My concern is that it may turn out that once all the effort to
>> refactor the code is done, we won't see major difference. So we
>> need base numbers, and performance tests would be a great helper
>> here.
>>
>>
>> It's really sad for me to see so little faith in what I'm saying.
>> The test I've done using plain kombu driver was needed exactly to
>> check that network is not the bottleneck for messaging
>> performance. If you don't believe in my performance analysis we
>> could ask someone else to do their own research and provide
>> results.
> Technology is not about faith. :)
>
> First, let me make it clear I'm *not* against refactoring or anything
> that will improve performance. I'm just a bit skeptical, but hopefully
> you'll be able to show everyone I'm wrong, and then the change will
> occur. :)
>
> To add more velocity to your effort, strong arguments should be
> present. To facilitate that, I would start from adding performance
> tests that would give us some basis for discussion of changes proposed
> later.
Please see below for a detailed answer about implementing performance tests.
It explains why it's hard to present arguments that would be
strong enough for you.
I can run performance tests locally, but that's not enough for the community.

In addition, I've provided some links to places in the existing
implementation that IMHO cause bottlenecks.
From my point of view that code is doing obviously stupid things (like
closing/opening sockets for each message sent).
That is enough for me to rewrite it even without additional proof that
it's wrong.
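
To illustrate the kind of pattern I mean, here is a minimal sketch. It is
not oslo.messaging code: FakeConnection and both sender functions are
hypothetical stand-ins, and the only thing it shows is how many connections
each approach opens for the same batch of messages.

```python
class FakeConnection:
    """Hypothetical stand-in for a broker connection; it only counts opens."""

    opened = 0  # class-wide counter of connections created so far

    def __init__(self):
        FakeConnection.opened += 1
        self.sent = []

    def send(self, message):
        self.sent.append(message)

    def close(self):
        pass


def send_with_reconnect(messages):
    """Open and close a connection for every message (the costly pattern)."""
    for message in messages:
        conn = FakeConnection()
        try:
            conn.send(message)
        finally:
            conn.close()


def send_with_reuse(messages):
    """Open one connection and reuse it for the whole batch."""
    conn = FakeConnection()
    try:
        for message in messages:
            conn.send(message)
    finally:
        conn.close()


def count_opens(sender, messages):
    """Return how many connections `sender` opened while sending `messages`."""
    FakeConnection.opened = 0
    sender(messages)
    return FakeConnection.opened
```

With a real broker each open also costs a TCP handshake plus protocol
negotiation, so the gap in connection counts should translate into a gap in
message rate, though how large it is would have to be measured.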
>
> Then, describing proposed details in a spec will give more exposure to
> your ideas. At the moment, I see general will to enhance the library,
> but not enough details on how to achieve this. Specification can make
> us think not about the burden of change that obviously makes people
> skeptical about rewrite-all approach, but about specific technical issues.
I agree that we should start with a spec. However, instead of a spec
of needed changes I would prefer a spec describing the needed
functionality of the library (it may differ from the existing functionality).
Using such a spec we could decide what is needed and what needs to be
removed to achieve it.
>
>> Problem with refactoring that I'm planning is that it's not a
>> minor refactoring that can be applied in one patch but it's the
>> whole library rewritten from scratch.
> You can still maintain a long sequence of patches, like we did when we
> migrated neutron to oslo.messaging (it was like ~25 separate pieces).
Taking into account possible gate issues, I would like to avoid a long
series of patches, since they won't be able to land at the same time and
rebasing will become a huge pain.
If we decide to start working on a 2.0 API/implementation, I think a topic
branch 2.0a would be much better.
>
>> Existing messaging code was written long long time ago (in a galaxy
>> far far away maybe?) and it was copy-pasted directly from nova. It
>> was not built as a library and it was never intended to be used
>> outside of nova. Some parts of it cannot even work normally cause
>> it was not designed to work with drivers like zeromq (matchmaker
>> stuff).
> oslo.messaging is NOT the code you can find in oslo-incubator rpc
> module. It was hugely rewritten to expose a new, cleaner API. This is
> btw one of the reasons migration to this new library is so painful. It
> was painful to move to oslo.messaging, so we need clear need for a
> change before switching to yet another library.
The API has indeed changed, but the general implementation details and
processing flow go way back to 2011 and the nova code (for example, the
general Publisher/Consumer implementation in impl_rabbit).
That's the code I'm talking about.

Refactoring as I see it will do the opposite. It will keep as much of the
API intact as possible but change the internals to make them more efficient
(that's why I call it refactoring). So a 2.0 version might be (partially?)
backwards compatible, and migration won't be such a pain.
>
>> The reason I've raised this question on the mailing list was to get
>> some agreement about future plans of oslo.messaging development and
>> start working on it in coordination with community. For now I don't
>> see any actions plan emerging from it. I would like to see us
>> bringing more constructive ideas about what should be done.
>>
>> If you think that first action should be profiling lets discuss how
>> it should be implemented (cause it works for me just fine on my
>> local PC). I guess we'll need to define some basic scenarios that
>> would show us overall performance of the library.
> Let's start from basic send/receive throughput, for tiny and large
> messages, multiple consumers etc.
This would be a great start, but it's quite hard to test basic
send/receive since the existing code is written around rpc.
I don't see a way to send a message without complex rpc code being involved.
That's why I propose to start with a refactoring that would separate rpc
code from basic messaging code.
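
As a rough idea of what a basic send/receive throughput scenario could look
like once messaging is separated from rpc, here is a minimal sketch. It is
an assumption-laden stand-in: an in-memory queue.Queue plays the role of the
transport, and measure_throughput is a hypothetical helper, not an existing
oslo.messaging function. A real test would replace the queue with a driver.

```python
import queue
import threading
import time


def measure_throughput(payload, count, consumers=1):
    """Push `count` copies of `payload` through an in-memory queue.

    Returns (messages_received, messages_per_second).
    `payload` must not be None, because None is used as the stop sentinel.
    """
    q = queue.Queue()
    received = [0] * consumers  # one counter per consumer thread

    def consume(index):
        while True:
            item = q.get()
            if item is None:  # sentinel: stop this consumer
                break
            received[index] += 1

    threads = [
        threading.Thread(target=consume, args=(i,)) for i in range(consumers)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for _ in range(count):
        q.put(payload)
    for _ in threads:
        q.put(None)  # one sentinel per consumer, queued after all payloads
    for thread in threads:
        thread.join()
    elapsed = time.perf_counter() - start

    total = sum(received)
    rate = total / elapsed if elapsed > 0 else float("inf")
    return total, rate
```

Calling it with a tiny payload (say 16 bytes) and a large one (say 1 MB),
and with one and several consumers, would cover the scenarios mentioned
above. The absolute numbers from this stand-in mean nothing; only the shape
of the harness is the point.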
>
>> There are a lot of questions that should be answered to implement
>> this: Where such tests would run (jenkins, local PC, devstack VM)?
> I would expect it to be exposed to jenkins thru 'tox'. We then can set
> up a separate job to run them and compare with a base line [TBD: what
> *is* baseline?] to make sure we don't introduce performance regressions.
Such tests cannot be exposed thru 'tox' since they require some
environment setup (rabbitmq-server, zeromq matchmaker, etc.). Such setup
is way out of scope for tox.
Because of this we should find some other way to run such tests.
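
One possible shape for such tests, sketched under assumptions: check whether
a broker is reachable and skip the tests when it is not, so they do no harm
where the environment is missing. broker_available and BrokerPerformanceTest
are hypothetical names, and port 5672 is only the conventional AMQP default;
nothing here sets the broker up, which is still the open problem.

```python
import socket
import unittest


def broker_available(host="127.0.0.1", port=5672, timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


@unittest.skipUnless(broker_available(), "no broker on 127.0.0.1:5672")
class BrokerPerformanceTest(unittest.TestCase):
    """Runs only when a broker is listening; skipped otherwise."""

    def test_broker_reachable(self):
        # A real test would publish and consume messages here.
        self.assertTrue(broker_available())
```

This only detects an open TCP port. It does not verify that the listener is
really a messaging server or that it is configured for the tests.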
>
>> How such scenarios should look like? How do we measure performance
>> (cProfile, etc.)?
> I think we're interested in message rate, not CPU utilization.
The problem here is that it's hard to find the bottleneck in message rate
without deeper analysis (CPU utilization, etc.).
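
The two measurements can be collected together. Below is a minimal sketch
that reports a message rate and the top cProfile entries for the same run.
serialize_and_send is a hypothetical stand-in for a send path (it only
serializes to JSON and appends to a list), so the hotspots it shows are not
those of oslo.messaging.

```python
import cProfile
import io
import json
import pstats
import time


def serialize_and_send(message, sink):
    """Hypothetical send path: serialize the message and append it to a sink."""
    sink.append(json.dumps(message))


def profile_send(count):
    """Return (messages_per_second, profiler_report) for `count` sends."""
    sink = []
    profiler = cProfile.Profile()

    start = time.perf_counter()
    profiler.enable()
    for i in range(count):
        serialize_and_send({"id": i, "body": "x" * 64}, sink)
    profiler.disable()
    elapsed = time.perf_counter() - start

    # Render the five entries with the highest cumulative time into a string.
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(5)

    rate = count / elapsed if elapsed > 0 else float("inf")
    return rate, report.getvalue()
```

Note that the profiler itself slows the code down, so the rate measured
under cProfile is lower than the real one. The rate is better taken from a
separate unprofiled run, with the profile used only to locate hotspots.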
>
>> How do we collect results? How do we analyze results to find
>> bottlenecks? etc.
>>
>> Another option would be to spend some of my free time implementing
>> mentioned refactoring (as I see it) and show you the results of
>> performance testing compared with existing code.
> This approach generally doesn't work beyond PoC. Openstack is a
> complex project, and we need to stick to procedures - spec review,
> then coding, all in upstream, with no private branches outside common
> infrastructure.
I agree with this approach, but it also has many drawbacks. I don't know
a clean way to communicate design drafts and implementation details
without actually writing the code.
And if you already have working code, the spec review becomes
quite a useless burden.
If you know a way to solve this problem (creating a high/mid-level
architecture design), please share it with me so we can use it.
>
>> The only problem with such approach is that my code won't be
>> oslo.messaging and it won't be accepted by community. It may be
>> drop in base for v2.0 but I'm afraid this won't be acceptable
>> either.
>>
> Future does not occur here that way. If you want your work to be
> consumed by community, you need to work with it.
That's what I'm trying to do :)
>
>> Regards, Alexei Kornienko
>>
>>
>> 2014-06-30 17:51 GMT+03:00 Gordon Sim <gsim at redhat.com>:
>>
>> On 06/30/2014 12:22 PM, Ihar Hrachyshka wrote:
>>
>> Alexei Kornienko wrote:
>>
>> Some performance tests may be introduced but they would be more
>> like functional tests since they require setup of actual messaging
>> server (rabbit, etc.).
>>
>>
>> Yes. I think we already have some. F.e.
>> tests/drivers/test_impl_qpid.py attempts to use local Qpid
>> server (backing up to fake server if it's not available).
>>
>>
>> I always get failures when there is a real qpidd service listening
>> on the expected port. Does anyone else see this?



