[openstack-dev] [Zaqar] Zaqar and SQS Properties of Distributed Queues

Gordon Sim gsim at redhat.com
Wed Sep 24 14:33:37 UTC 2014


Apologies in advance for possible repetition and pedantry...

On 09/24/2014 02:48 AM, Devananda van der Veen wrote:
> 2. Single Delivery - each message must be processed *exactly* once
>    Example: Using a queue to process votes. Every vote must be counted only once.

It is also important to consider the ability of the publisher to 
reliably publish a message exactly once. If that can't be guaranteed, 
the application may need de-duplication even if the queue itself 
guarantees exactly-once delivery, because two copies of the same 
logical message could have been published.
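The publisher-side problem can be sketched as follows: a publisher that retries after a timeout may publish the same logical message twice, so the consumer de-duplicates on a stable message ID. This is an illustrative sketch, not Zaqar or SQS API; `make_message`, `DedupConsumer`, and the in-memory `seen` set are all hypothetical names.

```python
import uuid

def make_message(body):
    # The ID is fixed when the logical message is created, not per send
    # attempt, so a retried publish carries the same ID.
    return {"id": str(uuid.uuid4()), "body": body}

class DedupConsumer:
    """Consumer-side de-duplication on the publisher-assigned ID."""

    def __init__(self):
        self.seen = set()  # in production: a bounded or persistent store

    def handle(self, msg):
        if msg["id"] in self.seen:
            return False  # duplicate from a publisher retry; drop it
        self.seen.add(msg["id"])
        # ... process msg["body"] exactly once ...
        return True
```

A real system would bound or persist the `seen` store, but the point stands: without a stable publisher-assigned identity, exactly-once delivery from the queue alone cannot give exactly-once processing of the logical message.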

> 5. Aggregate throughput
>    Example: Ad banner processing. Remember when sites could get
> slash-dotted? I need a queue resilient to truly massive spikes in
> traffic.

A massive spike in traffic can also be handled by allowing the queue to 
grow, rather than by increasing the throughput. This is obviously only 
effective if it really is a spike, i.e. the rate of ingress drops again 
so that the backlog can be processed.

So scaling up aggregate throughput is certainly an important requirement 
for some. However the example illustrates another: scaling the size of 
the queue (because the bottleneck for throughput may be in the 
application processing, or that processing may be temporarily 
unavailable). The latter is something I suspect both Zaqar and SQS 
would do quite well at.
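A toy model makes the distinction concrete. The numbers below are purely illustrative (my assumption, not from either system): a 2-second spike of 100 msg/s hits a consumer that drains only 10 msg/s, so the queue absorbs the spike by growing and then drains once ingress stops.

```python
from collections import deque

queue = deque()
backlog = []
for second in range(20):
    ingress = 100 if second < 2 else 0  # 2-second spike, then quiet
    for _ in range(ingress):
        queue.append("msg")
    for _ in range(min(10, len(queue))):  # consumer rate: 10 msg/s
        queue.popleft()
    backlog.append(len(queue))

# Backlog peaks at 180 just after the spike and is empty again by t=19.
```

Here no extra throughput was ever provisioned; the queue's ability to hold a large backlog is what rode out the spike.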

> 6. FIFO - When ordering matters
>    Example: I can't "stop" a job that hasn't "start"ed yet.

I think FIFO is insufficiently precise.

The most extreme requirement is total ordering, i.e. all messages are 
assigned a place in a fixed sequence and the order in which they are 
seen is the same for all receivers.

The example you give above is really causal ordering. Since the need to 
stop a job is caused by the starting of that job, the stop request must 
come after the start request. However the ordering of the stop request 
for task A with respect to a stop request for task B may not be defined 
(e.g. if they are triggered concurrently).

The pattern in use is also relevant. For multiple competing consumers, 
if there are ordering requirements such as the one in your example, it 
is not sufficient to *deliver* the messages in order, they must also be 
*processed* in order.

If I have two consumers processing task requests, and give the 'start A' 
message to one and then the 'stop A' message to the other, it is 
possible that the second message, though dispatched by the messaging 
service after the first, is still processed before it.

One way to avoid that would be to have the application use a separate 
queue per processing consumer, and ensure causally related messages are 
sent through the same queue. The downside is less adaptive load 
balancing and resiliency. Another option is to have the messaging 
service recognise message groupings and ensure that messages in a group, 
where a previously delivered message has not yet been acknowledged, are 
delivered only to the same consumer as that previous message.
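The first option (a queue per consumer, causally related messages routed together) can be sketched by hashing a group key, e.g. the task ID, to pick a queue, so 'start A' and 'stop A' always land on the same consumer's queue. The `route` function and queue layout are illustrative names, not any Zaqar or SQS API.

```python
import hashlib

def route(group_key, num_queues):
    # Stable hash of the group key picks a queue; all messages for the
    # same task therefore go to the same consumer, preserving their
    # causal order, while unrelated tasks can land elsewhere.
    digest = hashlib.sha256(group_key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_queues

queues = {i: [] for i in range(3)}
for task, action in [("A", "start"), ("B", "start"), ("A", "stop")]:
    queues[route(task, 3)].append((task, action))
```

The trade-off mentioned above is visible here: if the consumer owning task A's queue is slow or down, A's messages wait, whereas group-aware delivery by the messaging service itself could reassign a group once its in-flight messages are acknowledged.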

[...]
> Zaqar relies on a store-and-forward architecture, which is not
> amenable to low-latency message processing (4).

I don't think store-and-forward precludes low-latency ('low' is of 
course subjective). Polling however is not a good fit for latency 
sensitive applications.

> Again, as with SQS, it is not a wire-level protocol,

It is a wire-level protocol, but as it is based on HTTP it doesn't 
support asynchronous delivery of messages from server to client at present.

> so I don't believe low-latency connectivity (3) was a design goal.

Agreed (and that is the important thing, so sorry for the nitpicking!).
