[openstack-dev] [Heat] Using Job Queues for timeout ops
zbitter at redhat.com
Mon Dec 1 21:05:42 UTC 2014
On 13/11/14 13:59, Clint Byrum wrote:
> I'm not sure we have the same understanding of AMQP, so hopefully we can
> clarify here. This stackoverflow answer echoes my understanding:
> Not ack'ing just means they might get retransmitted if we never ack. It
> doesn't block other consumers. And as the link above quotes from the
> AMQP spec, when there are multiple consumers, FIFO is not guaranteed.
> Other consumers get other messages.
Thanks, obviously my recollection of how AMQP works was coloured too
much by oslo.messaging.
> So just add the ability for a consumer to read, work, ack to
> oslo.messaging, and this is mostly handled via AMQP. Of course that
> also likely means no zeromq for Heat without accepting that messages
> may be lost if workers die.
> Basically we need to add something that is not "RPC" but instead
> "jobqueue" that mimics this:
> I've always been suspicious of this bit of code, as it basically means
> that if anything fails between that call, and the one below it, we have
> lost contact, but as long as clients are written to re-send when there
> is a lack of reply, there shouldn't be a problem. But, for a job queue,
> there is no reply, and so the worker would dispatch, and then
> acknowledge after the dispatched call had returned (including having
> completed the step where new messages are added to the queue for any
> newly-possible children).
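For concreteness, the read, work, ack cycle described above can be sketched with a toy in-memory queue. This is not oslo.messaging or any real AMQP client: ToyBroker, work_one and failing_handler are hypothetical names made up for illustration, and the redelivery behaviour is only a rough imitation of what a broker does when a consumer goes away with unacked messages.

```python
import itertools
from collections import deque


class ToyBroker:
    """In-memory stand-in for a queue with explicit acks (hypothetical)."""

    def __init__(self):
        self.ready = deque()      # messages waiting for a consumer
        self.unacked = {}         # delivery tag -> message handed out but not acked
        self._tags = itertools.count(1)

    def publish(self, body):
        self.ready.append(body)

    def get(self):
        """Hand out the next message; it stays unacked until ack() is called."""
        if not self.ready:
            return None
        tag = next(self._tags)
        body = self.ready.popleft()
        self.unacked[tag] = body
        return tag, body

    def ack(self, tag):
        del self.unacked[tag]

    def consumer_died(self):
        """Simulate a lost consumer: unacked messages go back on the queue."""
        for tag in sorted(self.unacked):
            self.ready.append(self.unacked.pop(tag))


def work_one(broker, handler):
    """Read, work, ack. Returns False if there was nothing to read."""
    delivery = broker.get()
    if delivery is None:
        return False
    tag, body = delivery
    children = handler(body)      # may raise; then ack() below is never reached
    for child in children:        # enqueue newly-possible children first
        broker.publish(child)
    broker.ack(tag)               # acknowledge only after the work is done
    return True


def failing_handler(body):
    """Hypothetical handler that models a worker dying mid-job."""
    raise RuntimeError("worker died while handling %r" % (body,))
```

If work_one() is called with failing_handler, the message is left in unacked and comes back after consumer_died(), so a second worker can pick it up. Note that a crash between publishing the children and the ack would make the parent run again, so this gives at-least-once rather than exactly-once processing.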
I'm curious how people are deploying Rabbit at the moment. Are they
setting up multiple brokers and writing messages to disk before
accepting them? I assume yes on the former but no on the latter, since
there's no particular point in having e.g. 5 nines durability in the
queue when the overall system is as weak as your flakiest node.
OTOH if we were to add what you're proposing, then we would need folks
to deploy Rabbit that way (at least for Heat), since having workers ack
only after the work is done is insufficient to make messaging reliable
if the broker itself can easily lose the message outright.
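A toy model of that failure mode, as a sketch only: ToyBrokerNode and the "check-stack-timeout" message are hypothetical, and the "disk" here is just a second list that survives a simulated restart. In RabbitMQ terms the persist-before-accept case corresponds roughly to durable queues with persistent messages and publisher confirms.

```python
class ToyBrokerNode:
    """Hypothetical single broker node: 'disk' survives a restart, 'memory' does not."""

    def __init__(self, persist_before_accept):
        self.persist_before_accept = persist_before_accept
        self.memory = []
        self.disk = []

    def publish(self, body):
        if self.persist_before_accept:
            self.disk.append(body)    # written to disk before accepting
        self.memory.append(body)
        return True                   # publisher is told the message was accepted

    def ack(self, body):
        """Consumer finished the work; the broker may now forget the message."""
        self.memory.remove(body)
        if body in self.disk:
            self.disk.remove(body)

    def crash_and_restart(self):
        """Only what reached the disk is still there after a restart."""
        self.memory = list(self.disk)


fast = ToyBrokerNode(persist_before_accept=False)
safe = ToyBrokerNode(persist_before_accept=True)
for node in (fast, safe):
    node.publish("check-stack-timeout")
    node.crash_and_restart()          # broker dies before any worker acks
```

After the restart, fast.memory is empty while safe.memory still holds the message. In the first case no amount of care on the consumer side helps, because there is nothing left to redeliver.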
I think all of the proposed approaches would benefit from this feature,
but I'm concerned about any increased burden on deployers too.