[openstack-dev] [Oslo] First steps towards amqp 1.0

Gordon Sim gsim at redhat.com
Tue Dec 10 12:49:07 UTC 2013


On 12/10/2013 12:36 AM, Mike Wilson wrote:
> This is the first time I've heard of the Dispatch router; I'm really
> excited now that I've looked at it a bit. Thx Gordon and Russell for
> bringing this up. I'm very familiar with the scaling issues associated
> with any kind of brokered messaging solution. We grew an OpenStack
> installation to about 7,000 nodes and started having significant
> scaling issues with the qpid broker. We've talked about our problems
> at a couple of summits in a fair amount of detail [1][2]. I won't
> bother repeating the information in this thread.
>
> I really like the idea of separating the logic of routing away from
> the message emitter. Russell mentioned the 0mq matchmaker; we
> essentially ditched the qpid broker for direct communication via 0mq
> and its matchmaker. That setup still has a lot of problems which
> Dispatch seems to address. For example, in ceilometer we have
> store-and-forward behavior as a requirement. This kind of
> communication requires a broker, but 0mq doesn't really officially
> support one, which means we would probably end up with some broker as
> part of OpenStack. The matchmaker is also a fairly basic
> implementation of what is essentially a directory. For any sort of
> serious production use case you end up sprinkling JSON files all over
> the place or maintaining a Redis backend. I feel like the matchmaker
> needs a bunch more work to make modifying the directory simpler for
> operations. I would rather put that work into a separate project like
> Dispatch than have to maintain what is essentially a one-off in
> OpenStack's codebase.
>
> I wonder how this fits into messaging from a driver perspective in
> OpenStack, or even how this fits into oslo.messaging. Right now we
> have topics for binaries (compute, network, consoleauth, etc.),
> hostname.service_topic for nodes, a fanout queue per node (not sure if
> kombu also has this) and different exchanges per project. If we can
> abstract the routing from the emission of the message, all we really
> care about is the emitter, the endpoint and the messaging pattern
> (fanout, store and forward, etc.). I'm also not sure whether there's a
> Dispatch analogue in the RabbitMQ world; if not, we need some mapping
> of concepts between implementations.
>
> So many questions, but in general I'm really excited about this and
> eager to contribute. For sure I will start playing with this in
> Bluehost's environments that haven't been completely 0mqized. I also
> have some lingering concerns about qpid in general. Beyond scaling
> issues, I've run into some other terrible bugs that motivated our move
> away from it. Again, these are mentioned in our presentations at
> summits and I'd be happy to talk more about them in a separate
> discussion. I've also been able to talk to some other qpid+OpenStack
> users who have seen the same bugs. Another large installation that
> comes to mind is Qihoo 360 in China. They run a few thousand nodes
> with qpid for messaging and are familiar with the snags we ran into.
>
> Gordon,
>
> I would really appreciate it if you could watch those two talks and
> comment. The bugs are probably separate from the Dispatch router
> discussion, but it does dampen my enthusiasm a bit not knowing how to
> fix issues beyond scale :-(.

Mike (and others),

First, as a Qpid developer, let me apologise for the frustrating 
experience you have had.

The qpid components used here are not the most user-friendly, it has to
be said. They work well for the most commonly taken paths, but there
can be some unanticipated problems outside of those.

The main failing, I think, is that we in the Qpid community did not get
involved in OpenStack earlier to listen, understand the use cases and
help diagnose and address problems. I joined this list specifically to
try to rectify that failing.

The specific issues I gleaned from the presentations were:

(a) issues with eventlet and qpid.messaging integration

The qpid.messaging library does some perhaps quirky things that made
the monkey-patched solution more awkward. The OpenStack RPC
implementation over qpid was heavily driven by the Kombu-based RabbitMQ
implementation, although the client libraries are quite different in
design. The addressing syntax for the qpid.messaging library is also
not always the most intuitive.
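
To give a flavour of that syntax (purely illustrative; the queue name
and option values here are made up, not what the driver actually uses),
a simple topic subscription ends up looking something like:

    from qpid.messaging import Connection

    conn = Connection("localhost:5672")
    conn.open()
    session = conn.session()

    # the address string mixes the node name with declaration options
    receiver = session.receiver(
        "nova_topic ; {create: always,"
        " node: {type: topic, x-declare: {durable: False}},"
        " link: {durable: False, x-declare: {auto-delete: True}}}")

which is rather a lot to get right compared to simply declaring a queue
and binding it.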

As suggested in another mail on this thread, for an AMQP 1.0-based
driver I would pick an approach that programs more directly to the
protocol and allows oslo.messaging to retain control over threading
choices etc., avoiding some of these sorts of integration issues.
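
As a very rough sketch of what I mean, here is a listener using the
Python binding of the Qpid Proton engine (purely as an example; the URL
and address are made up and a real driver would of course look quite
different):

    from proton.handlers import MessagingHandler
    from proton.reactor import Container

    class Listener(MessagingHandler):
        # The container owns the I/O loop in this sketch, but the same
        # engine can be driven from whatever loop oslo.messaging
        # chooses, with no monkey patching of a client library needed.

        def __init__(self, url, address):
            super(Listener, self).__init__()
            self.url = url
            self.address = address

        def on_start(self, event):
            conn = event.container.connect(self.url)
            event.container.create_receiver(conn, self.address)

        def on_message(self, event):
            print("received: %s" % event.message.body)

    Container(Listener("amqp://localhost:5672", "compute.host1")).run()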

(b) general scaling issues with a standalone qpidd instance,

As you point out very clearly, a single broker is always going to be a
bottleneck. Further, there are some aspects of the integration code
that I think unnecessarily reduce performance: for example, each call
and cast is synchronous, only a single message of prefetch is enabled
for any subscription (forcing more round trips), and senders and
receivers are created for every request and reply.
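
To make the latter two points concrete (just a sketch against the
qpid.messaging API; the addresses are invented for illustration), the
idea would be something like:

    from qpid.messaging import Connection

    conn = Connection("localhost:5672")
    conn.open()
    session = conn.session()

    # create the links once and reuse them for every call/cast...
    sender = session.sender("amq.topic/compute")
    receiver = session.receiver("amq.topic/reply.host1")

    # ...and prefetch more than one message, so that each delivery
    # does not cost an extra round trip to the broker
    receiver.capacity = 100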

(c) message loss,

The code I have studied doesn't enable acknowledgements for messages.
This may be what is desired in some cases, but perhaps not in others
(e.g. if you want reliable delivery of notifications even when there
are connection failures). In addition, heartbeats aren't enabled, which
can lead to long timeouts for any host failures. It is hard to be very
concrete without a little more detail; however, I am generally pretty
confident in the robustness of qpidd's handling of acknowledged,
reliable delivery.
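
Just to illustrate what enabling both would look like with
qpid.messaging (again only a sketch; the address and timings here are
made up):

    from qpid.messaging import Connection

    # a heartbeat (in seconds) lets either side notice a dead peer
    # quickly rather than waiting on long TCP timeouts
    conn = Connection("localhost:5672", heartbeat=10)
    conn.open()
    session = conn.session()
    receiver = session.receiver("amq.topic/notifications.info")

    msg = receiver.fetch(timeout=60)
    # ... process the notification ...
    session.acknowledge(msg)  # only now may the broker discard it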

(d) problems multiplied by using qpidd clustering,

With the benefit of hindsight, the cluster solution was a flawed
design. We did actually realise this early on, but spent a lot of
effort trying to keep patching it up. The design is based around
replicated state machines, but this reduces concurrency and thus
scalability, since the events need to be processed in the same order on
every node to guarantee the same state. The boundary of the state
machine was also chosen to be too wide. It encompassed lots of state
that was not completely deterministic, which caused loss of consistency
in the replicated state. Though it was active-active, it was never a
solution for scaling, only a mechanism for broker state replication;
we did not communicate that sufficiently clearly. And again, it really
wasn't designed with the usage patterns of OpenStack in mind.

I'm also aware of an issue where exchanges are created and never
deleted. This was due to a missing feature in qpidd - namely
auto-deleted exchanges - which is now fixed upstream. We hadn't ever
had anyone request that feature, so it got stuck on the to-do list for
too long.
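
For completeness, with a qpidd that has that feature the cleanup can be
requested through the address options when the exchange is declared,
along these lines (the exchange name is invented for illustration):

    from qpid.messaging import Connection

    conn = Connection("localhost:5672")
    conn.open()
    session = conn.session()

    # ask the broker to delete the exchange once it is no longer in use
    sender = session.sender(
        "fanout_host1 ; {create: always,"
        " node: {type: topic,"
        "        x-declare: {type: fanout, auto-delete: True}}}")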

I'd be very keen to hear about any other issues you encountered, so
that we can identify the causes, fix bugs and, most importantly, ensure
the necessary lessons are applied to any further developments.

I'm also delighted that the concepts behind Dispatch Router are of
interest, and it would be fantastic to have you (and anyone else with
an interest) involved. I myself am very interested in helping in
whatever way I can to find the ideal solution for OpenStack's messaging
needs, whatever that is, using all the insights and experience gained
from you and others. As you say, the communication mechanisms are of
central and critical importance to the overall functioning of the
system.

Just for context, Qpid is an Apache project that was formed to help
promote AMQP-based solutions and encompasses numerous different
components. It is open to all who are interested in contributing,
whether in the form of feedback, code or anything else, and the rules
of governance are there to ensure that it serves the needs of the
community as a whole.

--Gordon.

>
> -Mike Wilson
>
> [1]
> http://www.openstack.org/summit/portland-2013/session-videos/presentation/using-openstack-in-a-traditional-hosting-environment
> [2]
> http://www.openstack.org/summit/openstack-summit-hong-kong-2013/session-videos/presentation/going-brokerless-the-transition-from-qpid-to-0mq



