Open Stack

Tue Jul 7 19:37:05 UTC 2015

Gordon Sim wrote:
> On 07/07/2015 05:48 PM, Clint Byrum wrote:
>> all of the call sites I checked _do not appear to resend_, they
>> simply explode on timeout waiting for reply. This is how calling code
>> should work and I'm ok with code in nova, cinder, et al. being
>> written this way, because I'd expect my messaging layer to be at
>> least somewhat reliable
>
> In my opinion, the calling code has better context for determining
> whether or not to retry. Tackling reliability issues end-to-end is often
> much more efficient also.
>
> [...]
>> I think you'll find that once you try to make oslo.messaging handle the
>> retrying, that with the broker simply being ack'd all the time, you risk
>> duplicating RPC calls if you retry in a loop.
>
> Resending the request will always risk duplicating the call (unless the
> caller can verify that the previous request was not executed in some
> call specific way). Whether or not you acknowledge the request (and
> whether you do it before or after the processing of the request), the
> response can still get lost (neither requests nor responses are
> currently confirmed by the broker).
>
> There is a message id 'cache' used to try and detect (and then ignore)
> duplicates. It's not clear to me how effective that is in practice as it
> only tracks the last 16 ids for a given listener. In any case if the
> listener process is restarted, or if the call is redelivered to a
> different server in a group, then the id cache would be of no use.
>
>> The pattern is well
>> established in RabbitMQ that acks should happen _AFTER_ the message has
>> been consumed and thus should not be duplicated, not before.
>
> That is the pattern for at-least-once delivery, where either processing
> is able to detect that a resent message was already processed or where
> reprocessing it is preferable to not processing it at all.
>
> I *believe* oslo.messaging (or impl_rabbit at least) was aiming for an
> at-most-once guarantee (i.e. avoiding duplication at the expense of
> dropped messages). That may be why the acknowledgement is done before
> processing, though since the acknowledgement is asynchronous, that only
> narrows the window; it doesn't eliminate it.
>
> I may of course be wrong. It would be great to have someone more
> qualified to comment on the intentions of the design provide some clarity.

From the history that I remember, the above was done because most of the
projects could not handle duplicated messages and would do things over
and over again (further corrupting the world). This may be better now
(?), as projects are able to do state transitions on resources correctly
and so avoid corrupting the world: a message that says do X can be
ignored/dropped when the state transitions disallow it (since it
already happened).
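
To make the state-transition idea concrete, here is a minimal Python
sketch. It is not code from nova, cinder or oslo.messaging; the names
Server, ALLOWED and handle are hypothetical and exist only for this
illustration. It assumes the handler can look up the resource's current
state before acting, so a redelivered "start" message is dropped instead
of being executed a second time.

```python
# Hypothetical sketch: Server, ALLOWED and handle are invented names,
# not part of any OpenStack project or of oslo.messaging.

# Allowed (current state, action) pairs and the state each one leads to.
ALLOWED = {
    ("stopped", "start"): "running",
    ("running", "stop"): "stopped",
}


class Server:
    """Stand-in for a resource whose state is tracked by the service."""

    def __init__(self):
        self.state = "stopped"
        self.boots = 0  # counts how often the side effect really ran


def handle(server, action):
    """Apply `action` only if the current state allows it.

    Returns True if the action was applied, or False if the message was
    dropped (for example a duplicate of an already processed request).
    """
    new_state = ALLOWED.get((server.state, action))
    if new_state is None:
        return False  # transition not allowed: ignore the message
    if action == "start":
        server.boots += 1  # stand-in for the real side effect
    server.state = new_state
    return True


server = Server()
first = handle(server, "start")      # original request is applied
duplicate = handle(server, "start")  # redelivered copy is dropped
```

With a guard like this, acknowledging after processing (at-least-once)
becomes less risky, since a redelivered message fails the transition
check. A real service would also need the check and the state update to
be atomic, which this sketch does not attempt to show.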

I might not have the history right (and I can't seem to find the prior 
discussions) but I think that was the gist of it; others might be able 
to remember more than I can (this question has come up before I think, I 
may have even asked it).


[openstack-dev] [oslo.messaging] [mistral] Acknowledge feature of RabbitMQ in oslo.messaging