[openstack-dev] [oslo.messaging] [mistral] Acknowledge feature of RabbitMQ in oslo.messaging
gsim at redhat.com
Tue Jul 7 19:16:19 UTC 2015
On 07/07/2015 05:48 PM, Clint Byrum wrote:
> all of the call sites I checked _do not appear to resend_, they
> simply explode on timeout waiting for reply. This is how calling code
> should work and I'm ok with code in nova, cinder, et al. being
> written this way, because I'd expect my messaging layer to be at
> least somewhat reliable
In my opinion, the calling code has better context for determining
whether or not to retry. Tackling reliability issues end-to-end is
also often much more efficient.
> I think you'll find that once you try to make oslo.messaging handle the
> retrying, that with the broker simply being ack'd all the time, you risk
> duplicating RPC calls if you retry in a loop.
Resending the request will always risk duplicating the call (unless the
caller can verify that the previous request was not executed in some
call specific way). Whether or not you acknowledge the request (and
whether you do it before or after the processing of the request), the
response can still get lost (neither requests nor responses are
currently confirmed by the broker).
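
To make the duplication risk concrete, here is a minimal sketch in
plain Python. It does not use oslo.messaging at all: call_with_retry,
make_lossy_service and was_executed are hypothetical names invented
for this illustration, and the "service" is simulated so that the
first request is executed but its reply is lost.

```python
def call_with_retry(send, was_executed, attempts=3):
    """Call send(); after a timeout, resend only if was_executed() is False."""
    for _ in range(attempts):
        try:
            return send()
        except TimeoutError:
            # Either the request or the reply was lost; the timeout alone
            # cannot tell us which one.
            if was_executed():
                raise  # resending now would duplicate the call
    raise TimeoutError("no reply after %d attempts" % attempts)


def make_lossy_service():
    """Simulated service: executes every request, loses the first reply."""
    state = {"sent": 0, "executed": 0}

    def send():
        state["sent"] += 1
        state["executed"] += 1  # the side effect happens on the server
        if state["sent"] == 1:
            raise TimeoutError("reply lost")
        return "ok"

    return send, state


# Blind retry: the caller has no way to check, so the call runs twice.
blind_send, blind_state = make_lossy_service()
blind_result = call_with_retry(blind_send, was_executed=lambda: False)

# Caller with a call-specific check: it sees the side effect and stops.
safe_send, safe_state = make_lossy_service()
try:
    call_with_retry(safe_send, was_executed=lambda: safe_state["executed"] > 0)
    safe_raised = False
except TimeoutError:
    safe_raised = True
```

In the blind case the request is executed twice even though only one
reply is ever seen; only the call-specific check avoids that, which is
why the calling code is the natural place for the retry decision.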
There is a message id 'cache' used to try and detect (and then ignore)
duplicates. It's not clear to me how effective that is in practice, as
it only tracks the last 16 ids for a given listener. In any case, if
the listener process is restarted, or if the call is redelivered to a
different server in a group, then the id cache would be of no use.
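
The limits of such a cache can be sketched as follows. This is not the
oslo.messaging code: RecentIdCache and seen_before are hypothetical
names, and the only detail taken from above is the size of 16 ids per
listener.

```python
from collections import deque


class RecentIdCache:
    """Remembers only the most recent `size` message ids (hypothetical)."""

    def __init__(self, size=16):
        self._ids = deque(maxlen=size)  # oldest id is evicted when full

    def seen_before(self, msg_id):
        """Return True if msg_id is still cached; otherwise record it."""
        if msg_id in self._ids:
            return True
        self._ids.append(msg_id)
        return False


cache = RecentIdCache()
first = cache.seen_before("msg-0")          # new id, gets recorded
immediate_dup = cache.seen_before("msg-0")  # duplicate is detected

# Sixteen newer messages push "msg-0" out of the cache.
for i in range(1, 17):
    cache.seen_before("msg-%d" % i)
late_dup = cache.seen_before("msg-0")       # duplicate is NOT detected

# A restarted listener, or another server in the group, starts empty.
other_listener = RecentIdCache()
after_restart = other_listener.seen_before("msg-5")  # NOT detected either
```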
> The pattern is well
> established in RabbitMQ that acks should happen _AFTER_ the message has
> been consumed and thus should not be duplicated, not before.
That is the pattern for at-least-once delivery, where either processing
is able to detect that a resent message was already processed or where
reprocessing it is preferable to not processing it at all.
I *believe* oslo.messaging (or impl_rabbit at least) was aiming for an
at-most-once guarantee (i.e. avoiding duplication at the expense of
dropped messages). That may be why the acknowledgement is done before
processing, though since the acknowledgement is asynchronous, that only
narrows the window; it doesn't eliminate it.
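
The difference between the two orderings can be shown with a toy
model. This is only a sketch under simplifying assumptions: the
"broker" is an in-memory queue, the ack is synchronous (unlike the
asynchronous ack described above), and the consumer crashes exactly
once. The names Crash and run are made up for the example.

```python
from collections import deque


class Crash(Exception):
    """Stands in for the consumer process dying during a delivery."""


def run(ack_first, crash_point):
    """Deliver one message with a single crash; return times processed.

    crash_point is "before_processing" or "after_processing".
    """
    queue = deque(["request"])  # the broker keeps a message until acked
    processed = 0
    crashed = False
    while queue:
        try:
            if ack_first:
                queue.popleft()  # ack before processing
            if not crashed and crash_point == "before_processing":
                crashed = True
                raise Crash()
            processed += 1       # the handler's side effect
            if not crashed and crash_point == "after_processing":
                crashed = True
                raise Crash()
            if not ack_first:
                queue.popleft()  # ack after processing
        except Crash:
            continue             # an unacked message is redelivered
    return processed


lost = run(ack_first=True, crash_point="before_processing")
once_early_ack = run(ack_first=True, crash_point="after_processing")
once_late_ack = run(ack_first=False, crash_point="before_processing")
duplicated = run(ack_first=False, crash_point="after_processing")
```

Acking first never processes the message more than once but can drop
it (at-most-once); acking afterwards never drops it but can process it
twice (at-least-once).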
I may of course be wrong. It would be great to have someone more
qualified to comment on the intentions of the design provide some clarity.