[openstack-dev] [oslo][messaging] Further improvements and refactoring

Ihar Hrachyshka ihrachys at redhat.com
Tue Jul 1 14:52:21 UTC 2014


On 01/07/14 15:55, Alexei Kornienko wrote:
> Hi,
> 
> Thanks for detailed answer. Please see my comments inline.
> 
> Regards,
> 
> On 07/01/2014 04:28 PM, Ihar Hrachyshka wrote:
>> On 30/06/14 21:34, Alexei Kornienko wrote:
>>>> Hello,
>>>> 
>>>> 
>>>> My understanding is that your analysis is mostly based on
>>>> running a profiler against the code. Network operations can
>>>> be bottlenecked in other places.
>>>> 
>>>> You compare a 'simple script using kombu' with a 'script using
>>>> oslo.messaging'. You don't compare a script using
>>>> oslo.messaging before and after refactoring. The latter
>>>> would show whether refactoring was worth the effort. Your
>>>> test shows that oslo.messaging performance sucks, but it's
>>>> not certain that fixing the hotspots you've revealed will
>>>> give a huge boost.
>>>> 
>>>> My concern is that it may turn out that once all the effort
>>>> to refactor the code is done, we won't see major difference.
>>>> So we need base numbers, and performance tests would be a
>>>> great helper here.
>>>> 
>>>> 
>>>> It's really sad for me to see so little faith in what I'm
>>>> saying. The test I've done using plain kombu driver was
>>>> needed exactly to check that network is not the bottleneck
>>>> for messaging performance. If you don't believe in my
>>>> performance analysis we could ask someone else to do their
>>>> own research and provide results.
> Technology is not about faith. :)
> 
> First, let me make it clear I'm *not* against refactoring or
> anything that will improve performance. I'm just a bit skeptical,
> but hopefully you'll be able to show everyone I'm wrong, and then
> the change will occur. :)
> 
> To add more velocity to your effort, strong arguments should be 
> present. To facilitate that, I would start from adding performance 
> tests that would give us some basis for discussion of changes
> proposed later.
>> Please see below for detailed answer about performance tests 
>> implementation. It explains a bit why it's hard to present
>> arguments that would be strong enough for you. I may run
>> performance tests locally but it's not enough for community.

Yes, that's why shipping some tests ready to run with oslo.messaging
can help. Science is about reproducibility, right? ;)

> 
>> And in addition I've provided some links to existing
>> implementation with places that IMHO cause bottlenecks. From my
>> point of view that code is doing obviously stupid things (like 
>> closing/opening sockets for each message sent).

That indeed sounds bad.

>> That is enough for me to rewrite it even without additional
>> proofs that it's wrong.

[Full disclosure: I'm not as involved in oslo.messaging internals as
you probably are, so I may say dumb things.]

I wonder whether there are easier ways to fix that particular issue
without rewriting everything from scratch. Like, provide a pool of
connections and make send() functions use it instead of creating new
connections (?)
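
To make the pool idea concrete, here is a minimal sketch. It is not
oslo.messaging or kombu code: FakeConnection, ConnectionPool and the
module-level send() are hypothetical names, and FakeConnection only
counts how often a connection gets created.

```python
import contextlib
import queue


class FakeConnection:
    """Hypothetical stand-in for a broker connection (not a real kombu class)."""

    opened = 0  # how many connections were ever created

    def __init__(self):
        FakeConnection.opened += 1
        self.sent = []

    def send(self, message):
        self.sent.append(message)


class ConnectionPool:
    """Hands out idle connections and creates a new one only when none is idle."""

    def __init__(self, factory, max_size=4):
        self._factory = factory
        self._idle = queue.LifoQueue(maxsize=max_size)

    @contextlib.contextmanager
    def acquire(self):
        try:
            conn = self._idle.get_nowait()
        except queue.Empty:
            conn = self._factory()
        try:
            yield conn
        finally:
            try:
                self._idle.put_nowait(conn)
            except queue.Full:
                pass  # pool already holds max_size idle connections; drop this one


def send(pool, message):
    # Reuse a pooled connection instead of opening one per message.
    with pool.acquire() as conn:
        conn.send(message)


pool = ConnectionPool(FakeConnection)
for i in range(100):
    send(pool, {"seq": i})
```

With this sketch, sequential sends share one connection. A real pool
would also have to detect broken connections before putting them back,
which is left out here.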

> 
> Then, describing proposed details in a spec will give more exposure
> to your ideas. At the moment, I see general will to enhance the
> library, but not enough details on how to achieve this.
> A specification can make us think not about the burden of change that
> obviously makes people skeptical about a rewrite-all approach, but
> about specific technical issues.
>> I agree that we should start with a spec. However instead of
>> having spec of needed changes I would prefer to have a spec
>> describing needed functionality of the library (it may differ
>> from existing functionality).

Meaning, breaking API, again?

>> Using such a spec we could decide what is needed and what needs
>> to be removed to achieve what we need.
> 
>>>> Problem with refactoring that I'm planning is that it's not
>>>> a minor refactoring that can be applied in one patch but it's
>>>> the whole library rewritten from scratch.
> You can still maintain a long sequence of patches, like we did when
> we migrated neutron to oslo.messaging (it was like ~25 separate
> pieces).
>> Taking into account possible gate issues, I would like to avoid
>> long series of patches since they won't be able to land at the
>> same time and rebasing will become a huge pain.

But you're the one proposing the change, so you need to take on the burden.
Having a new branch for everything-rewritten version of the library
means that each bug fix or improvement to the library will require
being tracked by each developer in two branches, with significantly
different code. I think it's more honest to put rebase pain on people
who rework the code than on everyone else.

>> If we decide to start working on 2.0 API/implementation I think a
>> topic branch 2.0a would be much better.

I respectfully disagree. See above.

> 
>>>> Existing messaging code was written long long time ago (in a
>>>> galaxy far far away maybe?) and it was copy-pasted directly
>>>> from nova. It was not built as a library and it was never
>>>> intended to be used outside of nova. Some parts of it cannot
>>>> even work normally cause it was not designed to work with
>>>> drivers like zeromq (matchmaker stuff).
> oslo.messaging is NOT the code you can find in oslo-incubator rpc 
> module. It was hugely rewritten to expose a new, cleaner API. This
> is btw one of the reasons migration to this new library is so
> painful. It was painful to move to oslo.messaging, so we need clear
> need for a change before switching to yet another library.
>> API indeed has changed but general implementation details and
>> processing flow goes way back to 2011 and nova code (for example
>> general Publisher/Consumer implementation in impl_rabbit) That's
>> the code I'm talking about.

Roger.

> 
>> Refactoring as I see it will do the opposite thing. It will keep
>> intact as much API as possible but change internals to make it
>> more efficient (that's why I call it refactoring) So 2.0 version
>> might be (partially?) backwards compatible and migration won't be
>> such a pain.

That sounds promising. Though see my concern on your suggestion to
revisit the scope of the library above.

> 
>>>> The reason I've raised this question on the mailing list was
>>>> to get some agreement about future plans of oslo.messaging
>>>> development and start working on it in coordination with
>>>> community. For now I don't see any action plan emerging from
>>>> it. I would like to see us bringing more constructive ideas
>>>> about what should be done.
>>>> 
>>>> If you think that first action should be profiling let's
>>>> discuss how it should be implemented (cause it works for me
>>>> just fine on my local PC). I guess we'll need to define some
>>>> basic scenarios that would show us overall performance of the
>>>> library.
> Let's start from basic send/receive throughput, for tiny and large 
> messages, multiple consumers etc.
>> This would be a great start but it's quite hard to test basic 
>> send/receive since existing code is written around rpc. I don't
>> see a way to send a message without complex rpc code being 
>> involved. That's why I propose to start refactoring that would
>> separate rpc code from basic messaging code.

Again, removing RPC code from your tests won't mean the library as a
whole will get higher performance. That said, refactoring that would
result in clear separation of layers can be beneficial even without
major performance boost. But that means that we probably should not
put performance concerns as the main reason for rework. I would set
'clean code' as the primary goal.

> 
>>>> There are a lot of questions that should be answered to
>>>> implement this: Where would such tests run (jenkins, local
>>>> PC, devstack VM)?
> I would expect it to be exposed to jenkins thru 'tox'. We then can
> set up a separate job to run them and compare with a base line
> [TBD: what *is* baseline?] to make sure we don't introduce
> performance regressions.
>> Such tests cannot be exposed thru 'tox' since they require some 
>> environment setup (rabbitmq-server, zeromq matchmaker, etc.).
>> Such setup is way out of scope for tox. Cause of this we should
>> find some other way to run such tests.

You may just assume the server is already set up and available thru a
common socket.
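
One way this could look in a test module is sketched below. The
PERF_BROKER_URL variable name, the broker_reachable() helper and the
ThroughputTest class are all assumptions made up for this sketch, and
the test body is a placeholder. The test is skipped when no broker
answers, so a plain 'tox' run would not fail on machines without
rabbitmq-server.

```python
import os
import socket
import unittest
from urllib.parse import urlparse


def broker_reachable(url, timeout=1.0):
    """Return True if a TCP connection to the broker's host:port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or 5672  # 5672 is the default AMQP port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# PERF_BROKER_URL is a hypothetical variable name chosen for this sketch.
BROKER_URL = os.environ.get("PERF_BROKER_URL", "")
BROKER_UP = bool(BROKER_URL) and broker_reachable(BROKER_URL)


@unittest.skipUnless(BROKER_UP, "no broker available at PERF_BROKER_URL")
class ThroughputTest(unittest.TestCase):
    def test_placeholder(self):
        # A real test would send and receive messages through BROKER_URL here.
        self.assertTrue(broker_reachable(BROKER_URL))
```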

> 
>>>> What should such scenarios look like? How do we measure
>>>> performance (cProfile, etc.)?
> I think we're interested in message rate, not CPU utilization.
>> The problem here is that it's hard to find a bottleneck in message
>> rate without deeper analysis (cpu utilization, etc.)

Tests are not there to show hot spots; they can be used to avoid
performance regressions or to support claims about alleged performance
gains added by a patch (or refactoring).
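
A rough sketch of such a check follows. measure_rate() and
is_regression() are hypothetical helpers, the 10% tolerance is an
arbitrary assumption, and an in-process queue stands in for the real
transport. What the baseline should be is still TBD.

```python
import queue
import time


def measure_rate(send, receive, count=10000, payload=b"x" * 64):
    """Send and then receive `count` messages; return messages per second."""
    start = time.perf_counter()
    for _ in range(count):
        send(payload)
    for _ in range(count):
        receive()
    elapsed = time.perf_counter() - start
    return count / elapsed


def is_regression(rate, baseline, tolerance=0.10):
    """True if `rate` is more than `tolerance` (a fraction) below `baseline`."""
    return rate < baseline * (1.0 - tolerance)


# An in-process queue stands in for the transport in this sketch.
q = queue.Queue()
rate = measure_rate(q.put, q.get, count=1000)
print("%.0f msg/s" % rate)
```

A job could store the rate from a known-good commit and fail when
is_regression() returns True for a proposed patch.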

> 
>>>> How do we collect results? How do we analyze results to find 
>>>> bottlenecks? etc.
>>>> 
>>>> Another option would be to spend some of my free time
>>>> implementing mentioned refactoring (as I see it) and show you
>>>> the results of performance testing compared with existing
>>>> code.
> This approach generally doesn't work beyond PoC. Openstack is a 
> complex project, and we need to stick to procedures - spec review, 
> then coding, all in upstream, with no private branches outside
> common infrastructure.
>> I agree with such approach but it also has many drawbacks. I
>> don't know a clean way to communicate design drafts and
>> implementation details without actually writing the code.

You may still have a PoC. You just should not consider it final code;
it's there to support the spec case.

>> And if you already have a working code all this spec review
>> becomes quite a useless burden.

It adds context to your spec. And be prepared that lots of the code you
write before or after doing the spec work *will* be rewritten. :)

>> If you know a way how to solve this problem (creating high-mid
>> level architecture design) please share it with me so we could
>> use it.
> 
>>>> The only problem with such approach is that my code won't be 
>>>> oslo.messaging and it won't be accepted by community. It may
>>>> be drop in base for v2.0 but I'm afraid this won't be
>>>> acceptable either.
>>>> 
> Future does not occur here that way. If you want your work to be 
> consumed by community, you need to work with it.
>> That's what I'm trying to do :)

OK. BTW you can also join Oslo team at #openstack-oslo to discuss your
case and whatnot.

Cheers,
/Ihar



