[openstack-dev] [oslo.messaging][zeromq] Next step

Alec Hothan (ahothan) ahothan at cisco.com
Thu May 28 17:07:23 UTC 2015


Hi Oleksii,

Thanks for putting together the slides, they are well done and extremely useful!

I find this 0MQ driver redesign proposal a much needed improvement over the current design.
However, it is worth debating whether the proxy server needs to be kept at all, and I would be interested to hear from others whether this is something we should pursue.
Also, do we know how much interest there is in the community to contribute to, use or support a production-grade 0MQ driver in the future?

Comments inline...


From: ozamiatin <ozamiatin at mirantis.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Wednesday, May 27, 2015 at 3:52 AM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [oslo.messaging][zeromq] Next step

Hi,

I'll try to address the question about Proxy process.

AFAIK there is no way yet in zmq to bind more than once to a specific port (e.g. tcp://*:9501).

Apparently we can:

socket1.bind('tcp://node1:9501')
socket2.bind('tcp://node2:9501')

but we can not:

socket1.bind('tcp://*:9501')
socket2.bind('tcp://*:9501')

So if we would like the driver to listen on a definite, well-known port, we need to use a proxy which receives on a single socket and redirects to a number of sockets.

Right, you can only bind once on a given IP/port.
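
For reference, a minimal pyzmq sketch of that limitation (the port number is just the one from the example above; the second bind fails with "Address already in use"):

import zmq

ctx = zmq.Context()

sock1 = ctx.socket(zmq.ROUTER)
sock1.bind("tcp://*:9501")           # first bind on the wildcard address succeeds

sock2 = ctx.socket(zmq.ROUTER)
try:
    sock2.bind("tcp://*:9501")       # second bind on the same port is refused
except zmq.ZMQError as e:
    print("second bind refused: %s" % e)

sock2.close()
sock1.close()
ctx.term()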



It is a normal practice in zmq to do so. There are even some helpers implemented in the library, the so-called 'devices'.

Here the performance question is relevant. According to the ZeroMQ documentation [1], the basic heuristic is to allocate 1 I/O thread in the context for every gigabit per second of data that will be sent and received (aggregated).
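
A rough pyzmq sketch of such a device (the ROUTER/DEALER pair and the endpoints are only illustrative; the io_threads value just follows the heuristic from [1]):

import zmq

# One I/O thread per ~1 Gbit/s of aggregate traffic, per the heuristic in [1].
ctx = zmq.Context(io_threads=2)

frontend = ctx.socket(zmq.ROUTER)    # remote peers connect here on the fixed port
frontend.bind("tcp://*:9501")

backend = ctx.socket(zmq.DEALER)     # local services connect here over IPC
backend.bind("ipc:///tmp/zmq-proxy-backend")

zmq.proxy(frontend, backend)         # built-in device: shuffles messages both ways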

The other way is to 'bind_to_random_port', but then we need some mechanism to notify the client about the port we are listening on, so it is a more complicated solution.

It is all relative, and I actually find it simpler overall than using a proxy ;-)
Dynamic port binding has some benefits as well; it is widely used and is a well known/understood pattern.
In the current implementation, messaging end points register their topic to Redis with the host address (and implicitly the well-known port 5901). It would have been possible to register host+port instead if we were to consider bypassing the proxy.
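
A hypothetical sketch of that variant, just to make it concrete (the Redis key layout, host name and topic below are made up for illustration; the real driver would have its own naming):

import socket

import redis
import zmq

ctx = zmq.Context()
server = ctx.socket(zmq.ROUTER)
port = server.bind_to_random_port("tcp://*")   # let the kernel pick a free port

r = redis.StrictRedis(host="redis-host")       # hypothetical Redis endpoint
# Publish host:port for this topic instead of just the host name.
r.sadd("zmq.topic.compute.node-1", "%s:%d" % (socket.gethostname(), port))

# ... then serve requests on 'server' as usual ...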


Why run it in a separate process? For the zmq API it doesn't matter whether the communication is between threads (INPROC), between processes (IPC) or between nodes (TCP, PGM and others). Because we need to run the proxy once per node, it is easier to do it in a separate process. How would we track whether the proxy is already running if we put it in a thread of some service?

There would not be any proxy at all. Each end point (nova compute, OVS agent, neutron server...) would simply listen on its own unique IP+port (that's real peer to peer).
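
A matching client-side sketch under the same hypothetical key layout as above (the consumer looks the peer up in Redis and connects to it directly, with no proxy hop):

import redis
import zmq

r = redis.StrictRedis(host="redis-host")           # hypothetical Redis endpoint
peer = r.srandmember("zmq.topic.compute.node-1")   # e.g. b"node-1:34567"

ctx = zmq.Context()
client = ctx.socket(zmq.DEALER)
client.connect("tcp://%s" % peer.decode())         # direct peer-to-peer connection
client.send(b"hello")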




In spite of having a broker-like instance locally, we still stay brokerless, because we have no central broker node with a queue that we need to replicate and keep alive. Each node is actually a peer. The broker is not a standalone node, so we cannot say that it is a 'single point of failure'.

You are correct in that regard (i.e. at the "node" level), but there is also process-level HA: if the proxy process goes down for whatever reason, all end points on that node become unreachable. One complication to keep in mind is that some production deployment schemes (those that use a container or a VM to deploy OpenStack services) would have to accommodate that extra proxy process (in the same way they do today with RabbitMQ clusters) by adding one extra service "container" just for the proxy, which is pretty heavy-handed since you can't bundle the proxy process with any other container. For example, a typical compute node would have 2 "packages" (nova-compute and the OVS agent, say); with that proxy there would need to be a third.


We can consider the local broker as a part of a server. It is worth noting that IPC communication is much more reliable than real network communication.
One more benefit is that the proxy is stateless, so we don't have to bother managing state (syncing it or having enough memory to keep it).

I agree the proxy server is not very complex and likely solid, but there are downsides to it as well.
Regarding TCP sockets vs. IPC sockets (which are presumably based on Unix domain sockets): note that you won't be able to use IPC if the proxy server has to run inside a VM by itself (as would be the case for certain deployment schemes); you'd have to use TCP sockets in that case, with the extra burden of connecting the end points to the proxy server while they can all potentially reside in different VMs - you cannot just use 127.0.0.1 to find the local proxy server.



I'll cite the zmq-guide about broker/brokerless (4.14. Brokerless Reliability p.221):

"It might seem ironic to focus so much on broker-based reliability, when we often explain ØMQ as "brokerless messaging". However, in messaging, as in real life, the middleman is both a burden and a benefit. In practice, most messaging architectures benefit from a mix of distributed and brokered messaging. "

Brokers and middlemen are beneficial in many situations, no question about it. In this particular situation there is actually already a "broker" of sorts, which is the Redis server ;-) The Redis server acts as a name server and allows dynamic discovery of services (topics and their associated addresses).
Brokers are interesting for 2 reasons:

  *   decouple the participants of a communication infra (e.g. you do not want to hardcode anything about peers, their count or their addresses); this can be done by a broker a la the 0MQ examples or by a name server
  *   do something special with the messages being brokered (e.g. persistence, replication, multicast, load balancing etc...), things that you can't do with simple peer-to-peer connections and things where RabbitMQ (presumably) excels

Given that the proxy server does not seem to do anything special with the messages (other than forwarding/unicasting), and given that Redis could provide full end-to-end addressing, the need for a proxy seems greatly diminished.
A list of drawbacks of having a proxy server on every node:

  *   potential deployment complications, as noted above
  *   a higher total connection count compared to a proxy-less design - this could be a problem if we ever get to the point of encrypting every connection (btw 0MQ supports encryption since version 3, although I have not personally tried it)
  *   one more hop for every message, in both directions
  *   it is not clear whether the proxy would propagate disconnection events from one side to the other
  *   the need to tend to buffering and flow control in the proxy (one policy may not fit all needs, and would it still be stateless?) - see the sketch below
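
To illustrate that last point, a sketch of the kind of buffering knobs such a proxy would have to expose (the values are arbitrary; high-water marks bound how many messages each side will queue before ZeroMQ starts blocking or dropping):

import zmq

ctx = zmq.Context()

frontend = ctx.socket(zmq.ROUTER)
frontend.set(zmq.SNDHWM, 1000)       # per-peer outbound queue limit
frontend.set(zmq.RCVHWM, 1000)       # inbound queue limit
frontend.bind("tcp://*:9501")

backend = ctx.socket(zmq.DEALER)
backend.set(zmq.SNDHWM, 1000)
backend.set(zmq.RCVHWM, 1000)
backend.bind("ipc:///tmp/zmq-proxy-backend")

zmq.proxy(frontend, backend)         # one HWM policy for every service behind it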

Lastly, note that the Redis server itself could be clustered for HA (a feature added recently), and this might be something we have to look at as well because it is another point of failure (it would be awkward to put the Redis server on 1 controller node when HA calls for 3 controller nodes, for example).

I'm still relatively new to oslo messaging and still have a lot of questions regarding a deployment based on 0MQ. I think it is important that we properly assess the arguments in favor of this protocol and make sure it provides a better option than RabbitMQ at production scale, backed by measurable evidence.

Thanks

  Alec







Thanks,
Oleksii


1 - http://zeromq.org/area:faq#toc7


5/26/15 18:57, Davanum Srinivas wrote:

Alec,

Here are the slides:
http://www.slideshare.net/davanum/oslomessaging-new-0mq-driver-proposal

All the 0mq patches to date should be either already merged in trunk
or waiting for review on trunk.

Oleksii, Li Ma,
Can you please address the other questions?

thanks,
Dims

On Tue, May 26, 2015 at 11:43 AM, Alec Hothan (ahothan)
<ahothan at cisco.com> wrote:


Looking at the next step following the design summit meeting on
0MQ, as the etherpad does not provide much information.
A few questions:
- would it be possible to make the slides presented (showing the proposed
changes in the 0MQ driver design) available somewhere?
- is there a particular branch in the oslo messaging repo that contains
0MQ related patches? I'm particularly interested in James Page's
patch to pool the 0MQ connections, but there might be others
- question for Li Ma, are you deploying with the straight upstream 0MQ
driver or with some additional patches?

The per node proxy process (which is itself some form of broker) needs to
be removed completely if the new solution is to be made really
broker-less. This will also eliminate the only single point of failure in
the path and reduce the number of 0MQ sockets (and hops per message) by
half.

I think it was proposed that we go on with the first draft of the new
driver (which still keeps the proxy server but reduces the number of
sockets) before eventually tackling the removal of the proxy server?



Thanks

  Alec



__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


