[openstack-dev] [Zaqar] Zaqar and SQS Properties of Distributed Queues

Gordon Sim gsim at redhat.com
Mon Sep 22 14:11:08 UTC 2014


On 09/19/2014 09:13 PM, Zane Bitter wrote:
> SQS offers very, very limited guarantees, and it's clear that the reason
> for that is to make it massively, massively scalable in the way that
> e.g. S3 is scalable while also remaining comparably durable (S3 is
> supposedly designed for 11 nines, BTW).
>
> Zaqar, meanwhile, seems to be promising the world in terms of
> guarantees. (And then taking it away in the fine print, where it says
> that the operator can disregard many of them, potentially without the
> user's knowledge.)
>
> On the other hand, IIUC Zaqar does in fact have a sharding feature
> ("Pools") which is its answer to the massive scaling question.

There are different dimensions to the scaling problem.

As I understand it, pools don't help in scaling a given queue, since all
the messages for that queue must be in the same pool. At present, traffic
through different Zaqar queues is essentially a set of entirely orthogonal
streams. Pooling can help scale the number of such orthogonal streams,
but to be honest, that's the easier part of the problem.
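
To make that concrete, here is a minimal sketch in Python. It is not
Zaqar's actual pool selection logic: the hash-based placement and the
names pool_for_queue and pool_load are assumptions for illustration
only. The one property it relies on is the one stated above: a queue
lives in exactly one pool.

```python
import hashlib
from collections import Counter


def pool_for_queue(queue_name, pools):
    """Map a queue to exactly one pool (hypothetical hash-based placement)."""
    digest = hashlib.sha256(queue_name.encode("utf-8")).digest()
    return pools[int.from_bytes(digest[:8], "big") % len(pools)]


def pool_load(traffic, pools):
    """Sum message counts per pool, given {queue_name: message_count}."""
    load = Counter({pool: 0 for pool in pools})
    for queue_name, message_count in traffic.items():
        # Every message of a queue lands in that queue's single pool.
        load[pool_for_queue(queue_name, pools)] += message_count
    return load
```

With many queues the load spreads across pools, but a single busy
queue puts all of its messages on one pool however many pools are added.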

There is also the possibility of using the sharding capabilities of the 
underlying storage. But the pattern of use will determine how effective 
that can be.

So for example, on the ordering question, if order is defined by a
single sequence number held in the database and atomically incremented
for every message published, that is not likely to be something where
the database's sharding is going to help in scaling the number of
concurrent publications.
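
A rough in-memory sketch of that pattern, in Python, may help. This is
not how Zaqar or any particular storage driver is implemented; the
class SingleSequenceQueue and its fields are hypothetical, and a lock
stands in for the database's atomic increment.

```python
import threading


class SingleSequenceQueue:
    """Messages are spread over shards, but ordered by one shared counter."""

    def __init__(self, shard_count):
        self._seq = 0
        self._seq_lock = threading.Lock()  # stand-in for an atomic increment
        self.shards = [[] for _ in range(shard_count)]
        self.counter_acquisitions = 0

    def publish(self, body):
        # Every publisher, whatever shard its message ends up in, must
        # first pass through this single serialization point.
        with self._seq_lock:
            self._seq += 1
            seq = self._seq
            self.counter_acquisitions += 1
        # Storage itself is distributed across the shards.
        self.shards[seq % len(self.shards)].append((seq, body))
        return seq
```

Adding shards here grows storage capacity, but the number of trips
through the counter is still one per published message.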

Though sharding would allow scaling the total number of messages on the
queue (by distributing them over multiple shards), the total ordering of
those messages reduces its effectiveness in scaling the number of
concurrent getters (e.g. the concurrent subscribers in pub-sub) since
they will all be getting the messages in exactly the same order.

Strict ordering impacts the competing consumers case also (and is in my 
opinion of limited value as a guarantee anyway). At any given time, the 
head of the queue is in one shard, and all concurrent claim requests 
will contend for messages in that same shard. Though the unsuccessful 
claimants may then move to another shard as the head moves, they will 
all again try to access the messages in the same order.
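
As a toy illustration in Python (not a model of any real claim
implementation), assume messages are placed on shards by sequence number
modulo the shard count. The functions strict_order_round and
relaxed_round are hypothetical; each returns how many claim requests
each shard sees in one round of simultaneous claims.

```python
from collections import Counter


def strict_order_round(head_seq, shard_count, claimants):
    """With strict ordering, every claimant goes after the same head message."""
    head_shard = head_seq % shard_count
    return Counter({head_shard: len(claimants)})


def relaxed_round(shard_count, claimants):
    """Without a total order, each claimant could take from a different shard."""
    return Counter(index % shard_count for index, _ in enumerate(claimants))
```

With four claimants and four shards, the strict round sends all four
requests to one shard, while the relaxed round sends one to each.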

So if Zaqar's goal is to scale the number of orthogonal queues, and the 
number of messages held at any time within these, the pooling facility 
and any sharding capability in the underlying store for a pool would 
likely be effective even with the strict ordering guarantee.

If scaling the number of communicants on a given communication channel
is a goal, however, then strict ordering may hamper that. If it does, it
seems to me that this is not just a policy tweak on the underlying
datastore to choose the desired balance between ordering and scale, but
a more fundamental question about the internal structure of the queue
implementation built on top of the datastore.

I also get the impression, perhaps wrongly, that providing the strict 
ordering guarantee wasn't necessarily an explicit requirement, but was 
simply a property of the underlying implementation(?).


