[openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting

Clint Byrum clint at fewbar.com
Tue Sep 9 23:56:03 UTC 2014


Excerpts from Samuel Merritt's message of 2014-09-09 16:12:09 -0700:
> On 9/9/14, 12:03 PM, Monty Taylor wrote:
> > On 09/04/2014 01:30 AM, Clint Byrum wrote:
> >> Excerpts from Flavio Percoco's message of 2014-09-04 00:08:47 -0700:
> >>> Greetings,
> >>>
> >>> Last Tuesday the TC held the first graduation review for Zaqar. During
> >>> the meeting some concerns arose. I've listed those concerns below with
> >>> some comments hoping that it will help starting a discussion before the
> >>> next meeting. In addition, I've added some comments about the project
> >>> stability at the bottom and an etherpad link pointing to a list of use
> >>> cases for Zaqar.
> >>>
> >>
> >> Hi Flavio. This was an interesting read. As somebody whose attention has
> >> recently been drawn to Zaqar, I am quite interested in seeing it
> >> graduate.
> >>
> >>> # Concerns
> >>>
> >>> - Concern on operational burden of requiring NoSQL deploy expertise to
> >>> the mix of openstack operational skills
> >>>
> >>> For those of you not familiar with Zaqar, it currently supports 2 nosql
> >>> drivers - MongoDB and Redis - and those are the only 2 drivers it
> >>> supports for now. This will require operators willing to use Zaqar to
> >>> maintain a new (?) NoSQL technology in their system. Before expressing
> >>> our thoughts on this matter, let me say that:
> >>>
> >>>      1. By removing the SQLAlchemy driver, we basically removed the
> >>> chance for operators to use an already deployed "OpenStack-technology"
> >>>      2. Zaqar won't be backed by any AMQP based messaging technology for
> >>> now. Here's[0] a summary of the research the team (mostly done by
> >>> Victoria) did during Juno
> >>>      3. We (OpenStack) used to require Redis for the zmq matchmaker
> >>>      4. We (OpenStack) also use memcached for caching and as the oslo
> >>> caching lib becomes available - or a wrapper on top of dogpile.cache -
> >>> Redis may be used in place of memcached in more and more deployments.
> >>>      5. Ceilometer's recommended storage driver is still MongoDB,
> >>> although Ceilometer now has support for sqlalchemy. (Please correct me
> >>> if I'm wrong).
> >>>
> >>> That being said, it's obvious we already, to some extent, promote some
> >>> NoSQL technologies. However, for the sake of the discussion, let's assume
> >>> we don't.
> >>>
> >>> I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't
> >>> keep avoiding these technologies. NoSQL technologies have been around
> >>> for years and we should be prepared - including OpenStack operators - to
> >>> support these technologies. Not every tool is good for all tasks - one
> >>> of the reasons we removed the sqlalchemy driver in the first place -
> >>> therefore it's impossible to keep a homogeneous environment for all
> >>> services.
> >>>
> >>
> >> I wholeheartedly agree that non-traditional storage technologies that
> >> are becoming mainstream are good candidates for use cases where SQL
> >> based storage gets in the way. I wish there wasn't so much FUD
> >> (warranted or not) about MongoDB, but that is the reality we live in.
> >>
> >>> With this, I'm not suggesting to ignore the risks and the extra burden
> >>> this adds but, instead of attempting to avoid it completely by not
> >>> evolving the stack of services we provide, we should probably work on
> >>> defining a reasonable subset of NoSQL services we are OK with
> >>> supporting. This will help making the burden smaller and it'll give
> >>> operators the option to choose.
> >>>
> >>> [0] http://blog.flaper87.com/post/marconi-amqp-see-you-later/
> >>>
> >>>
> >>> - Concern on should we really reinvent a queue system rather than
> >>> piggyback on one
> >>>
> >>> As mentioned in the meeting on Tuesday, Zaqar is not reinventing message
> >>> brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack
> >>> flavor on top. [0]
> >>>
> >>
> >> I think Zaqar is more like SMTP and IMAP than AMQP. You're not really
> >> trying to connect two processes in real time. You're trying to do fully
> >> asynchronous messaging with fully randomized access to any message.
> >>
> >> Perhaps somebody should explore whether the approaches taken by large
> >> scale IMAP providers could be applied to Zaqar.
> >>
> >> Anyway, I can't imagine writing a system to intentionally use the
> >> semantics of IMAP and SMTP. I'd be very interested in seeing actual use
> >> cases for it, apologies if those have been posted before.
> >
> > It seems like you're EITHER describing something called XMPP that has at
> > least one open source scalable backend called ejabberd. OR, you've
> > actually hit the nail on the head with bringing up SMTP and IMAP but for
> > some reason that feels strange.
> >
> > SMTP and IMAP already implement every feature you've described, as well
> > as retries/failover/HA and a fully end to end secure transport (if
> > installed properly). If you don't actually set them up to run as a public
> > messaging interface but just as a cloud-local exchange, then you could
> > get by with very low overhead for a massive throughput - it can very
> > easily be run on a single machine for Sean's simplicity, and could just
> > as easily be scaled out using well known techniques for public cloud
> > sized deployments?
> >
> > So why not use existing daemons that do this? You could still use the
> > REST API you've got, but instead of writing it to a mongo backend and
> > trying to implement all of the things that already exist in SMTP/IMAP -
> > you could just have them front to it. You could even bypass normal
> > delivery mechanisms and do neat things with local injection.
> >
> > I don't care about the NoSQL question on its own. Mongo is fine. Redis
> > is fine. I don't think either has any features for this use case that
> > make a lick's worth of difference compared to MySQL or Postgres, but I
> > also don't think they are a PROBLEM in and of themselves.
> >
> > The main thing I care about here is every description I've heard of what
> > zaqar wants to do (which does seem to be getting clearer through this
> > thread) is still well implemented somewhere as an existing scalable
> > service. Is zaqar actually Rabbit with a REST interface? Is it ejabberd
> > with a rest interface? Or is it IMAP/SMTP with a REST interface. You'll
> > note that probably nobody would think a single server that wanted to be
> > both Rabbit AND IMAP/SMTP is a good idea ... at least this is one of the
> > reasons why we all think Microsoft Exchange is a pile of garbage, no?
> >
> > I also worry about the fact that one description of zaqar was used to
> > communicate a need for divergent requirements (it needs to be a
> > high-volume fast message broker/queue - which, btw, sounds more like
> > Rabbit/oslo.messaging and less like what Clint describes above) ... and
> > that's why it wants to use falcon and not pecan and why it wants to use
> > mongo and not SQL. And then what we're doing is reimplementing something
> > like rabbit except in python (again, given as the justification for
> > deviating from how other bits of OpenStack work)
> >
> > BUT - if that's not actually what zaqar is - if it isn't a rabbit
> > replacement and doesn't need to do massive high volume sub-second
> > queuing because what it's actually modeling is a message subscription
> > service that's closer to email than to anything else, then there is
> > nothing about the components that are happily used in the rest of
> > OpenStack that should be precluded from being used. A REST api written
> > in pecan should be fine ... as should an SQL backend, because 99% of all
> > operations are going to be primary key lookups where even a moderately
> > tuned database should be absolutely fine at keeping up.
> >
> > So which is it? Because it sounds to me like it's a thing that actually
> > does NOT need to diverge in technology in any way, but that I've been
> > told that it needs to diverge because it's delivering a different set of
> > features - and I'm pretty sure if it _is_ the thing that needs to
> > diverge in technology because of its feature set, then it's a thing I
> > don't think we should be implementing in python in OpenStack because it
> > already exists and it's called AMQP.
> 
> Whether Zaqar is more like AMQP or more like email is a really strange 
> metric to use for considering its inclusion.
> 
> Let me put on my web application developer's hat. Whenever I've worked 
> on a web app, I've invariably wound up needing HTTP servers, background 
> workers, and some sort of queue to connect the two.
> 
> I've done the thing where I've stored queue entries in the app's 
> database and had the workers poll for jobs; the load adds up 
> surprisingly fast, and it's got some bad positive-feedback failure 
> modes. However, it is nice and durable, so my app doesn't lose messages.
> 
> I've done the thing where I've thrown together a VM and stuck Redis or 
> rabbitmq or beanstalkd on it. That gets me nice, fast queues, but no 
> semblance of reliability. If that one VM dies, all my queued messages 
> are lost.
> 
> Then there's Zaqar, which is this nice HTTP API that I can use for my 
> application's queues. I go and make a couple of POST requests and now 
> I've got some queues for my application to use. My app servers POST 
> messages to their queues, and my background workers sit and make GET 
> requests for messages to process. I can have the whole thing up and 
> running in a few hours. Better yet, I barely have to monitor the thing. 
> I can poll for queue stats every few minutes and alert if the queue gets 
> too full, but that's all I've got to do. I don't have to worry about my 
> queue VM going into swap, or my queue VM's NIC getting saturated, or 
> kernel panics, or automatically promoting rabbitmq slaves to masters, or 
> waking up at 3 AM to fix my app when I lose messages during a rabbitmq 
> promotion, or any of that stuff. Using Zaqar means I can just worry 
> about my application and leave all that other garbage to my cloud provider.

What you just described is the queue pattern I spoke of.

It does not require random access by message ID in any way, shape, or
form. It is also well served by AMQP. I wonder if people are still
confused by Zaqar's API and architecture, because this would be a fine
architecture if what you describe above were the requirements:

https://www.dropbox.com/s/yonloa9ytlf8fdh/ZaqarQueueOnly.png?dl=0

Just stick a REST shim in front of AMQP that enforces tenant permissions
and maps logical "zaqar queues" to whatever the backend serving queue is.
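
To make that concrete, here is a rough sketch of the shim logic in Python.
Everything in it is an assumption for illustration: the class names, the
"zaqar.<tenant>.<queue>" naming scheme, and the in-memory broker, which
merely stands in for a real AMQP server. The HTTP layer is left out; a
POST or GET handler would simply call post() or get().

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Stand-in for an AMQP broker; a real shim would talk to one instead."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def publish(self, backend_queue, body):
        self._queues[backend_queue].append(body)

    def get(self, backend_queue):
        queue = self._queues[backend_queue]
        return queue.popleft() if queue else None


class QueueShim:
    """Maps (tenant, logical queue) to a backend queue and checks ownership."""

    def __init__(self, broker):
        self._broker = broker
        self._routes = {}  # (tenant_id, logical_name) -> backend queue name

    def create_queue(self, tenant_id, name):
        # Hypothetical naming scheme; a real one would have to escape the
        # separator so that two tenants can never map to the same queue.
        self._routes[(tenant_id, name)] = "zaqar.%s.%s" % (tenant_id, name)

    def _resolve(self, tenant_id, name):
        # A tenant can only reach queues it created itself.
        try:
            return self._routes[(tenant_id, name)]
        except KeyError:
            raise PermissionError(
                "no queue %r for tenant %r" % (name, tenant_id))

    def post(self, tenant_id, name, body):
        self._broker.publish(self._resolve(tenant_id, name), body)

    def get(self, tenant_id, name):
        return self._broker.get(self._resolve(tenant_id, name))
```

Note that the shim keeps no message state of its own, only the routing
table; the messages live entirely in the broker.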

So why would we need a NoSQL database for the data itself if all we are
doing is shoving messages in and taking them out the other end?


