[openstack-dev] [nova] readout from Philly Operators Meetup

Clint Byrum clint at fewbar.com
Thu Mar 12 16:47:38 UTC 2015

Excerpts from Sean Dague's message of 2015-03-11 05:59:10 -0700:
> =============================
>  Additional Interesting Bits
> =============================
> Rabbit
> ------
> There was a whole session on Rabbit -
> https://etherpad.openstack.org/p/PHL-ops-rabbit-queue
> Rabbit is a top operational concern for most large sites. Almost all
> sites have a "restart everything that talks to rabbit" script because
> during rabbit HA operations queues tend to blackhole.
> All other queue systems OpenStack supports are worse than Rabbit (from
> experience in that room).
> oslo.messaging < 1.6.0 was a significant regression in dependability
> from the incubator code. It now seems to be getting better but still a
> lot of issues. (L112)
> Operators *really* want the concept in
> https://review.openstack.org/#/c/146047/ landed. (I asked them to
> provide such feedback in gerrit).

This reminded me that there are other options that need investigation.

A few of us have been looking at what it might take to use something
in between RabbitMQ and ZeroMQ for RPC and notifications. Some initial
forays into inspecting Gearman (which infra has successfully used for
quite some time as the backend of Zuul) look promising. A few notes:

* The Gearman protocol is crazy simple. There are currently 4 known gearman
  server implementations: Perl, Java, C, and Python (written and
  maintained by our own infra team). http://gearman.org/download/ for
  the others, and https://pypi.python.org/pypi/gear for the python one.

* Gearman has no pub/sub capability built in for 1:N comms. However, it
  is fairly straightforward to write workers that will rebroadcast
  messages to subscribers.

* Gearman's security model is not very rich. Only the C server supports
  any type of authentication at all (via SSL client certs), and once you
  have been authenticated to the gearman server you can do whatever you
  want, including consuming all the messages in a queue or filling up a
  queue with nonsense. This has been raised as a concern in the past and
  might warrant extra work to add authentication support to the python
  server and/or add ACL support.
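
To give a feel for how small the protocol is, here is a rough sketch of
the binary packet framing as I understand it from the published protocol
description: a 4-byte magic, a 4-byte big-endian packet type, a 4-byte
big-endian body size, then NUL-separated arguments. The fan_out() helper
and its "<topic>.<subscriber>" queue naming are purely hypothetical, just
one way a rebroadcasting worker could address subscribers; nothing here
is taken from an existing driver.

```python
import struct

# Packet type numbers as listed in the Gearman protocol description.
SUBMIT_JOB = 7
SUBMIT_JOB_BG = 18

REQ_MAGIC = b"\0REQ"
RES_MAGIC = b"\0RES"


def pack_packet(magic, ptype, args):
    """Frame one packet: magic, type, body size, NUL-separated args."""
    body = b"\0".join(args)
    return magic + struct.pack(">II", ptype, len(body)) + body


def unpack_packet(buf, nargs):
    """Parse one packet from the start of buf.

    Returns (magic, type, args, remaining bytes). The caller must know
    how many arguments the packet type carries, because only the last
    argument is allowed to contain NUL bytes.
    """
    magic = buf[:4]
    ptype, size = struct.unpack(">II", buf[4:12])
    body = buf[12:12 + size]
    args = body.split(b"\0", nargs - 1)
    return magic, ptype, args, buf[12 + size:]


def fan_out(topic, subscribers, unique, payload):
    """Build one background job per subscriber for a 1:N rebroadcast.

    Hypothetical scheme: each subscriber listens on its own function
    named "<topic>.<subscriber>", and a rebroadcasting worker submits
    the same payload to every one of them.
    """
    return [
        pack_packet(REQ_MAGIC, SUBMIT_JOB_BG,
                    [topic + b"." + sub, unique, payload])
        for sub in subscribers
    ]
```

For example, pack_packet(REQ_MAGIC, SUBMIT_JOB, [b"reverse", b"id-1",
b"hello"]) yields a single 30-byte request. A real deployment would use
an existing client library (such as gear) rather than hand-rolled
framing.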

Part of our motivation for this is that some of us are going to be
deploying a cloud soon and none of us are excited about deploying and
supporting RabbitMQ. So we may be proposing specs to add Gearman as a
deployment option soon.
