[openstack-dev] [oslo.messaging] [mistral] Acknowledge feature of RabbitMQ in oslo.messaging

Joshua Harlow harlowja at outlook.com
Tue Jul 7 19:42:28 UTC 2015


Gordon Sim wrote:
> On 07/07/2015 05:48 PM, Clint Byrum wrote:
>> all of the call sites I checked _do not appear to resend_, they
>> simply explode on timeout waiting for reply. This is how calling code
>> should work and I'm ok with code in nova, cinder, et al. being
>> written this way, because I'd expect my messaging layer to be at
>> least somewhat reliable
>
> In my opinion, the calling code has better context for determining
> whether or not to retry. Tackling reliability issues end-to-end is often
> much more efficient also.
>
> [...]
>> I think you'll find that once you try to make oslo.messaging handle the
>> retrying, that with the broker simply being ack'd all the time, you risk
>> duplicating RPC calls if you retry in a loop.
>
> Resending the request will always risk duplicating the call (unless the
> caller can verify that the previous request was not executed in some
> call specific way). Whether or not you acknowledge the request (and
> whether you do it before or after the processing of the request), the
> response can still get lost (neither requests nor responses are
> currently confirmed by the broker).
>
> There is a message id 'cache' used to try and detect (and then ignore)
> duplicates. It's not clear to me how effective that is in practice as it
> only tracks the last 16 ids for a given listener. In any case if the
> listener process is restarted, or if the call is redelivered to a
> different server in a group, then the id cache would be of no use.

The 16 ids stuff always makes me chuckle (at how weird and, IMHO,
useless it is); I remember that review, ha,
https://review.openstack.org/#/c/20567/ (IMHO a list of the past 16 ids
just hides the problem)... Maybe we can finally address the real problem
here (projects not being able to handle duplicate messages without
corrupting all the things...)
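
To make the limitation concrete, here is a minimal sketch of a "last 16
ids" filter. It is not the actual oslo.messaging code; the class name
RecentIdCache and its method are hypothetical, and a bounded deque is
simply assumed as the storage. It shows the two failure cases Gordon
describes: a resend that arrives after the id has been pushed out of the
window, and a resend that lands on a listener with an empty cache.

```python
import collections


class RecentIdCache:
    """Hypothetical stand-in for a per-listener duplicate filter."""

    def __init__(self, size=16):
        # A bounded deque silently drops the oldest id once it is full.
        self._seen = collections.deque(maxlen=size)

    def is_duplicate(self, msg_id):
        if msg_id in self._seen:
            return True
        self._seen.append(msg_id)
        return False


cache = RecentIdCache(size=16)
first_pass = [cache.is_duplicate(i) for i in range(17)]  # ids 0..16, all new
immediate_resend = cache.is_duplicate(16)  # id 16 is still in the window
late_resend = cache.is_duplicate(0)        # id 0 was evicted by id 16

restarted = RecentIdCache(size=16)         # fresh process, empty cache
after_restart = restarted.is_duplicate(16)
```

Only the immediate resend is caught; the late resend and the resend after
a restart both look like new messages, so the handler still has to cope
with duplicates itself.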

>
>> The pattern is well
>> established in RabbitMQ that acks should happen _AFTER_ the message has
>> been consumed and thus should not be duplicated, not before.
>
> That is the pattern for at-least-once delivery, where either processing
> is able to detect that a resent message was already processed or where
> reprocessing it is preferable to not processing it at all.
>
> I *believe* oslo.messaging (or impl_rabbit at least) was aiming for an
> at-most-once guarantee (i.e. avoiding duplication at the expense of
> dropped messages). That may be why the acknowledgement is done before
> processing, though since the acknowledgement is asynchronous, that only
> narrows the window; it doesn't eliminate it.
>
> I may of course be wrong. It would be great to have someone more
> qualified to comment on the intentions of the design provide some clarity.
>
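
The at-most-once vs. at-least-once distinction above can be illustrated
with a toy model. This is only a sketch under simplifying assumptions:
ToyBroker and consume are hypothetical names, there is no real RabbitMQ
involved, the ack is treated as synchronous (so the window is wider or
narrower in practice), and the consumer crashes exactly once, between
the ack and the processing step.

```python
class ToyBroker:
    """Hypothetical in-memory broker: keeps a message until it is acked."""

    def __init__(self, messages):
        self.pending = list(messages)

    def ack(self, msg):
        self.pending.remove(msg)


def consume(broker, ack_first, crash_on):
    """Drain the queue and return the messages that were processed, in order."""
    processed = []
    crashed = False
    while broker.pending:
        msg = broker.pending[0]
        if ack_first:
            first, second = broker.ack, processed.append
        else:
            first, second = processed.append, broker.ack
        first(msg)
        if msg == crash_on and not crashed:
            crashed = True  # consumer dies between the two steps, then restarts
            continue
        second(msg)
    return processed


# Ack before processing: the crashed message is gone from the broker.
at_most_once = consume(ToyBroker(["a", "b", "c"]), ack_first=True, crash_on="b")

# Ack after processing: the crashed message is redelivered and handled twice.
at_least_once = consume(ToyBroker(["a", "b", "c"]), ack_first=False, crash_on="b")
```

With the ack first, "b" is dropped; with the ack last, "b" is processed
twice. Neither ordering gives exactly-once, which is why the handler (or
the caller) has to decide which failure it would rather live with.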


