[openstack-dev] [Marconi] Why is marconi a queue implementation vs a provisioning API?

Kurt Griffiths kurt.griffiths at rackspace.com
Wed Mar 19 00:00:23 UTC 2014

I think we can agree that a data-plane API only makes sense if it is
useful to a large number of web and mobile developers deploying their apps
on OpenStack. Also, it only makes sense if it is cost-effective and
scalable for operators who wish to deploy such a service.

Marconi was born of practical experience and direct interaction with
prospective users. When Marconi was kicked off a few summits ago, the
community was looking for a multi-tenant messaging service to round out
the OpenStack portfolio. Users were asking operators for something easier
to work with and more web-friendly than established options such as AMQP.

To that end, we started drafting an HTTP-based API specification that
would afford several different messaging patterns, in order to support the
use cases that users were bringing to the table. We did this completely in
the open, and received lots of input from prospective users familiar with
a variety of message broker solutions, including more “cloudy” ones like
SQS and Iron.io.

The resulting design was a hybrid that supported what you might call
“claim-based” semantics à la SQS and feed-based semantics à la RSS.
Application developers liked the idea of being able to use one or the
other, or combine them to come up with new patterns according to their
needs. For example:

1. A video app can use Marconi to feed a worker pool of transcoders. When
a video is uploaded, it is stored in Swift and a job message is posted to
Marconi. Then, a worker claims the job and begins work on it. If the
worker crashes, the claim expires and the message becomes available to be
claimed by a different worker. Once the worker is finished with the job,
it deletes the message so that another worker will not process it, and
claims another message. Note that workers never “list” messages in this
use case; those endpoints in the API are simply ignored.
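The claim lifecycle described above can be modeled in a few lines of
Python. This is an illustrative in-memory sketch of the semantics, not
Marconi's actual implementation or API:

```python
import time
import uuid

class ClaimQueue:
    """Toy model of claim-based semantics: a message is invisible while
    claimed, becomes claimable again when the claim's TTL expires, and
    is gone for good once the worker deletes it."""

    def __init__(self):
        self._messages = {}  # msg_id -> [body, claim_expiry or None]

    def post(self, body):
        msg_id = uuid.uuid4().hex
        self._messages[msg_id] = [body, None]
        return msg_id

    def claim(self, ttl):
        now = time.time()
        for msg_id, entry in self._messages.items():
            if entry[1] is None or entry[1] <= now:  # unclaimed or expired
                entry[1] = now + ttl
                return msg_id, entry[0]
        return None  # nothing available to claim

    def delete(self, msg_id):
        self._messages.pop(msg_id, None)

q = ClaimQueue()
q.post('transcode video abc123')
claimed = q.claim(ttl=300)         # a worker takes the job
assert q.claim(ttl=300) is None    # no other worker can claim it now
q.delete(claimed[0])               # job done: message removed for good
```

If the worker crashed instead of calling delete, the claim would simply
time out and the next call to claim() would hand the same message to
another worker.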

2. A backup service can use Marconi to communicate with hundreds of
thousands of backup agents running on customers' machines. Since Marconi
queues are extremely lightweight, the service can create a different
queue for each agent, and additional queues to broadcast messages to all
the agents associated with a single customer. In this last scenario, the
service would post a message to a single queue and the agents would simply
list the messages on that queue, and everyone would get the same message.
This messaging pattern is emergent, and requires no special routing setup
in advance from one queue to another.
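The broadcast behavior falls out of listing semantics alone. A minimal
sketch of the idea (again, a toy model rather than Marconi code; the
marker mimics Marconi's paging-marker style of listing):

```python
class FeedQueue:
    """Toy model of feed-based (RSS-style) semantics: every reader
    lists messages independently, so one posted message reaches all
    readers without any routing setup."""

    def __init__(self):
        self._messages = []  # append-only list of (seq, body)
        self._next_seq = 0

    def post(self, body):
        self._messages.append((self._next_seq, body))
        self._next_seq += 1

    def list(self, marker=0):
        """Return messages at or after `marker`, plus a new marker the
        reader can pass next time to see only newer messages."""
        newer = [(seq, body) for seq, body in self._messages if seq >= marker]
        new_marker = newer[-1][0] + 1 if newer else marker
        return newer, new_marker

broadcast = FeedQueue()
broadcast.post('run backup now')

# Two agents, each keeping its own marker, both see the same message.
msgs_a, marker_a = broadcast.list(marker=0)
msgs_b, marker_b = broadcast.list(marker=0)
assert msgs_a == msgs_b
```

Because each agent tracks its own position, adding an agent is just
another reader; the service posts once regardless of fan-out.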

3. A metering service for an Internet application can use Marconi to
aggregate usage data from a number of web heads. Each web head collects
several minutes of data, then posts it to Marconi. A worker periodically
claims the messages off the queue, performs the final aggregation and
processing, and stores the results in a DB. So far, this messaging pattern
is very much like example #1, above. However, since Marconi’s API also
affords the observer pattern via listing semantics, the metering service
could run an auditor that logs the messages as they go through the queue
in order to provide extremely valuable data for diagnosing problems in the
aggregated data.
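The auditor works because listing is non-destructive: it can observe
every message even while workers are claiming them. A combined toy
model (illustrative only, not Marconi's implementation):

```python
import time
import uuid

class Queue:
    """Toy queue supporting both claim-based and listing semantics,
    to show how an auditor can observe claimed traffic."""

    def __init__(self):
        self._msgs = {}  # msg_id -> [body, claim_expiry or None]

    def post(self, body):
        msg_id = uuid.uuid4().hex
        self._msgs[msg_id] = [body, None]
        return msg_id

    def claim(self, ttl):
        now = time.time()
        for msg_id, entry in self._msgs.items():
            if entry[1] is None or entry[1] <= now:
                entry[1] = now + ttl
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        self._msgs.pop(msg_id, None)

    def list(self):
        # Listing sees every message, claimed or not, and removes
        # nothing -- which is exactly what makes auditing possible.
        return [entry[0] for entry in self._msgs.values()]

q = Queue()
q.post('web1: usage 10:00-10:05')
audit_snapshot = q.list()       # the auditor observes...
msg_id, body = q.claim(ttl=60)  # ...while a worker claims the same data
assert body in audit_snapshot   # the claim did not hide it from the log
```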

Users are excited about what Marconi offers today, and we are continuing
to evolve the API based on their feedback.

Of course, app developers aren’t the only audience Marconi needs to serve.
Operators want something that is cost-effective, scales, and is
customizable for the unique needs of their target market.

While Marconi has plenty of room to improve (who doesn’t?), here is where
the project currently stands in these areas:

1. Customizable. Marconi transport and storage drivers can be swapped out,
and messages can be manipulated in-flight with custom filter drivers.
Currently we have MongoDB and SQLAlchemy drivers, and are exploring Redis
and AMQP brokers. Now, the v1.0 API does impose some constraints on the
backend in order to support the use cases mentioned earlier. For example,
an AMQP backend would only be able to support a subset of the current API.
Operators occasionally ask about AMQP broker support, in particular, and
we are exploring ways to evolve the API in order to support that.
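As a rough illustration of the driver swap, here is roughly what
selecting backends looks like in marconi.conf (section and option names
are from memory of the sample config, so treat them as approximate):

```ini
[drivers]
transport = wsgi
storage = mongodb

[drivers:storage:mongodb]
uri = mongodb://localhost:27017
```

Changing `storage = mongodb` to `storage = sqlalchemy` (plus the
matching driver section) is, from the service's point of view, the
whole of the swap.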

2. Scalable. Operators can use Marconi’s HTTP transport to leverage their
existing infrastructure and expertise in scaling out web heads. When it
comes to the backend, for small deployments with minimal throughput needs,
we are providing a SQLAlchemy driver as a non-AGPL alternative to MongoDB.
For large-scale production deployments, we currently provide the MongoDB
driver and will likely add Redis as another option (there is already a POC
driver). And, of course, operators can provide drivers for NewSQL
databases, such as VoltDB, that are very fast and scale extremely
well. In Marconi, every queue can be associated with a different backend
cluster. This allows operators to scale both up and out, according to what
is most cost-effective for them. Marconi's app-level sharding is currently
done using a lookup table to provide for maximum operator control over
placement, but I personally think it would be great to see this opened up
so that we can swap in other types of drivers, such as one based on hash
rings (TBD).  
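The two placement strategies contrasted above can be sketched side by
side. This is a toy illustration of the idea, not Marconi's sharding
code; names like LookupTableShard are made up for the example:

```python
import bisect
import hashlib

class LookupTableShard:
    """Toy model of lookup-table sharding: an explicit table maps each
    queue to a storage pool, giving the operator full control over
    placement, with a default pool as the fallback."""

    def __init__(self, default_pool):
        self.default_pool = default_pool
        self.table = {}  # queue name -> pool name

    def register(self, queue, pool):
        self.table[queue] = pool

    def pool_for(self, queue):
        return self.table.get(queue, self.default_pool)

class HashRingShard:
    """Toy model of the hash-ring alternative: consistent hashing
    spreads queues across pools with no per-queue table entries."""

    def __init__(self, pools, replicas=100):
        self._ring = sorted(
            (int(hashlib.md5(f'{p}:{i}'.encode()).hexdigest(), 16), p)
            for p in pools for i in range(replicas))
        self._keys = [k for k, _ in self._ring]

    def pool_for(self, queue):
        h = int(hashlib.md5(queue.encode()).hexdigest(), 16)
        idx = bisect.bisect(self._keys, h) % len(self._ring)
        return self._ring[idx][1]

table = LookupTableShard(default_pool='pool-a')
table.register('video-jobs', 'pool-b')   # operator pins this queue
ring = HashRingShard(['pool-a', 'pool-b', 'pool-c'])
```

The lookup table trades an entry per pinned queue for exact control;
the hash ring trades control for automatic, even spreading.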

3. Cost-effective. The Marconi team has done a lot of work to (1) provide
several dimensions for scaling deployments that can be used according to
what is most cost-effective for a given use case, and (2) make the Marconi
service as efficient as possible, including time spent optimizing the
transport layer (using Falcon in lieu of Pecan, reducing the work that the
request handlers do, etc.), and tuning the MongoDB storage driver (the
SQLAlchemy driver is newer and we haven’t had the chance to tune it yet,
but we plan to do so during Juno). Request turnaround is in the
low-millisecond range (including HTTP overhead), not the microsecond
range, but that works perfectly well for a large class of applications.
We’ve been
benchmarking with Tsung for quite a while now, and we are working on
making the raw data more accessible to folks outside our team. I’ll try to
get some of the latest data up on the wiki this week.

Marconi was originally incubated because the community believed developers
building their apps on top of OpenStack were looking for this kind of
service, and it was a big missing gap in our portfolio. Since that time,
the team has worked hard to fill that gap.
