[openstack-dev] [Marconi] Why is marconi a queue implementation vs a provisioning API?

Devananda van der Veen devananda.vdv at gmail.com
Wed Mar 19 19:37:36 UTC 2014

Let me start by saying that I want there to be a constructive discussion
around all this. I've done my best to keep my tone as non-snarky as I could
while still clearly stating my concerns. I've also spent a few hours
reviewing the current code and docs. Hopefully this contribution will be
beneficial in helping the discussion along.

For what it's worth, I don't have a clear understanding of why the Marconi
developer community chose to create a new queue rather than an abstraction
layer on top of existing queues. While my lack of understanding there isn't
a technical objection to the project, I hope they can address this in the
aforementioned FAQ.

The reference storage implementation is MongoDB. AFAIK, no integrated
projects require an AGPL package to be installed, and from the discussions
I've been part of, that would be a show-stopper if Marconi required
MongoDB. As I understand it, this is why sqlalchemy support was required
when Marconi was incubated. Saying "Marconi also supports SQLA" is
disingenuous because SQLA is a second-class citizen: it has incomplete API
support, is clearly not the recommended storage driver, and is going to be
unusable at scale (I'll come back to this point in a bit).

Let me ask this. Which back-end is tested in Marconi's CI? That is the
back-end that matters right now. If that's Mongo, I think there's a
problem. If it's SQLA, then I think Marconi should declare any features
which SQLA doesn't support to be optional extensions, make SQLA the
default, and clearly document how to deploy Marconi at scale with a SQLA
back-end.

Then there's the db-as-a-queue antipattern, and the problems that I have
seen result from this in the past... I'm not the only one in the OpenStack
community with some experience scaling MySQL databases. Surely others have
their own experiences and opinions on whether a database (whether MySQL or
Mongo or Postgres or ...) can be used in such a way _at_scale_ and not fall
over from resource contention. I would hope that those members of the
community would chime into this discussion at some point. Perhaps they'll
even disagree with me!

A quick look at the code around claim (which, it seems, will be the most
commonly requested action) shows why this is an antipattern.

The MongoDB storage driver for claims requires _four_ queries just to get a
message, with a serious race condition (but at least it's documented in the
code) if multiple clients are claiming messages in the same queue at the
same time. For reference:


The SQLAlchemy storage driver is no better. It's issuing _five_ queries
just to claim a message (including a query to purge all expired claims
every time a new claim is created). The performance of this transaction
under high load is probably going to be bad...
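
As a rough illustration of the read-then-write race described above, here
is a minimal sketch. It is not Marconi's driver code: the schema and the
function names are hypothetical, and an in-memory SQLite database stands in
for the real back-end. Two clients each look up the next unclaimed message
before either one writes its claim.

```python
import sqlite3


def setup(n_messages=3):
    # Hypothetical schema (not Marconi's): claim_id is NULL while unclaimed.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY, body TEXT, claim_id TEXT)"
    )
    db.executemany(
        "INSERT INTO messages (body) VALUES (?)",
        [(f"msg-{i}",) for i in range(n_messages)],
    )
    db.commit()
    return db


def find_unclaimed(db):
    # Query 1: look up the oldest message that nobody has claimed yet.
    row = db.execute(
        "SELECT id FROM messages WHERE claim_id IS NULL ORDER BY id LIMIT 1"
    ).fetchone()
    return row[0] if row else None


def naive_mark(db, msg_id, client):
    # Query 2, unconditional: overwrites any claim made since the SELECT.
    db.execute(
        "UPDATE messages SET claim_id = ? WHERE id = ?", (client, msg_id)
    )
    db.commit()
    return True  # the caller always believes it owns the message


def guarded_mark(db, msg_id, client):
    # Query 2, conditional: only succeeds if the row is still unclaimed.
    cur = db.execute(
        "UPDATE messages SET claim_id = ? WHERE id = ? AND claim_id IS NULL",
        (client, msg_id),
    )
    db.commit()
    return cur.rowcount == 1


def simulate(mark):
    # Interleave two clients: both read before either of them writes.
    db = setup()
    seen_a = find_unclaimed(db)
    seen_b = find_unclaimed(db)
    won_a = mark(db, seen_a, "client-a")
    won_b = mark(db, seen_b, "client-b")
    owner = db.execute(
        "SELECT claim_id FROM messages WHERE id = ?", (seen_a,)
    ).fetchone()[0]
    return seen_a, seen_b, won_a, won_b, owner
```

With naive_mark, both clients believe they hold the same message. With
guarded_mark, the second client finds out that it lost and has to retry,
which is correct but adds yet another round trip under contention.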


Lastly, it looks like the Marconi storage drivers assume the storage
back-end to be infinitely scalable. AFAICT, the mongo storage driver
supports mongo's native sharding -- which I'm happy to see -- but the SQLA
driver does not appear to support anything equivalent for other back-ends,
e.g. MySQL. This relegates any deployment using the SQLA backend to the
scale of "only what one database instance can handle". It's unsuitable for
any large-scale deployment. Folks who don't want to use Mongo are likely to
use MySQL and will be promptly bitten by Marconi's lack of scalability with
this back-end.
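
To make "anything equivalent" concrete, here is a minimal sketch of
application-level sharding: each queue name is mapped to one of several
database instances by a stable hash. This is purely an assumption about
what such support could look like. As far as I can tell the SQLA driver
does nothing of the kind, and the function name and the shard URLs below
are hypothetical.

```python
import hashlib


def shard_for(queue_name, shards):
    # Use a stable hash so every API node routes a queue to the same shard.
    # (Python's built-in hash() is randomized per process, so avoid it here.)
    if not shards:
        raise ValueError("at least one shard is required")
    digest = hashlib.sha256(queue_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Hypothetical connection URLs, for illustration only.
SHARDS = [
    "mysql://db0.example.org/marconi",
    "mysql://db1.example.org/marconi",
    "mysql://db2.example.org/marconi",
]

if __name__ == "__main__":
    for name in ("orders", "billing", "notifications"):
        print(name, "->", shard_for(name, SHARDS))
```

Note that plain modulo routing remaps most queues whenever the number of
shards changes, so a real deployment would also need a migration story or
a consistent-hashing scheme.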

While there is a lot of room to improve the messaging around what/how/why,
and I think a FAQ will be very helpful, I don't think that Marconi should
graduate this cycle because:
(1) support for a non-AGPL-backend is a legal requirement [*] for Marconi's
(2) deploying Marconi with sqla+mysql will result in an incomplete and
unscalable service.

It's possible that I'm wrong about the scalability of Marconi with sqla +
mysql. If anyone feels that this is going to perform blazingly fast on a
single mysql db backend, please publish a benchmark and I'll be very happy
to be proved wrong. To be meaningful, it must have a high concurrency of
clients creating and claiming messages with (num queues) << (num clients)
<< (num messages), and all clients polling on a reasonably short interval,
based on whatever the recommended client-rate-limit is. I'd like the test
to be repeated with both Mongo and SQLA back-ends on the same hardware for

