[openstack-dev] [oslo] ack(), reject() and requeue() support in rpc ...
Sandy Walsh
sandy.walsh at rackspace.com
Fri Aug 16 13:46:55 UTC 2013
On 08/16/2013 09:47 AM, Flavio Percoco wrote:
> On 14/08/13 17:08 -0300, Sandy Walsh wrote:
>> At Eric's request in https://review.openstack.org/#/c/41979/ I'm
>> bringing this to the ML for feedback.
>>
>> Currently, oslo-common rpc behaviour is to always ack() a message no
>> matter what.
>>
> Hey,
>
> I don't think we should keep adding new features to Oslo's rpc, I'd
> rather think how this fits into oslo.messaging.
Read on ... I think we'll face the same issues in messaging.
There is an alternative, which was my first approach. In StackTach, we
wrote our own notification consumption layer, which dealt with the
ack()/requeue() stuff directly. But, understandably, this got pushback
when we attempted it in CM, as the opinion was that it belongs in oslo. The
argument makes sense ... code duplication, would only support amqp,
reinventing the wheel, etc. The motivation was the very discussion we're
having now :)
>
>> For billing purposes we can't afford to drop important notifications
>> (like *.exists). We only want to ack() if no errors are raised by the
>> consumer, otherwise we want to requeue the message.
>>
>> Now, once we introduce this functionality, we will also need to support
>> .reject() semantics.
>>
>> The use-case we've seen for this is:
>> 1. grab notification
>> 2. write to disk
>> 3. do some processing on that notification, which raises an exception.
>> 4. the event is requeued and steps 2-3 repeat very quickly. Lots of
>> duplicate records. In our case we've blown out our database.
>
> Although I see some benefits from abstracting this, I'm not sure
> whether we *really* need this in Oslo messaging. My main concern is
> that acknowledgement is not supported by all back-ends and this can
> turn out to be a design flaw for apps depending on methods like ack()
> / reject().
From what I've been researching on zeromq, the consensus seems to be
"zeromq is very fast, but if you want it to be reliable you have to code
it all yourself."
We can't afford to drop billable events. That's the entire purpose of
having our notification system. So, I'm all ears for other suggestions.
> Have you guys thought about re-sending the failed message on a
> different topic / queue?
Pie/cake. This is essentially requeue() :)
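To make the comparison concrete, here is a rough sketch of the "re-send on a different queue" idea with a bounded retry count, so a poison message cannot loop forever. It uses plain in-memory deques rather than a real broker, and every name in it (MAX_ATTEMPTS, handle, retry_queue, dead_letters, the event dict layout) is hypothetical, not anything in oslo or Celery.

```python
from collections import deque

MAX_ATTEMPTS = 3  # hypothetical cap on delivery attempts


def handle(event, handler, retry_queue, dead_letters):
    """Run handler on one event; on failure, re-send it elsewhere.

    Failed events go to retry_queue until MAX_ATTEMPTS is reached,
    after which they are parked in dead_letters for manual inspection.
    Returns True if the handler succeeded, False otherwise.
    """
    try:
        handler(event["payload"])
        return True
    except Exception:
        # Copy the event and bump its attempt counter before re-sending.
        attempts = event.get("attempts", 0) + 1
        failed = dict(event, attempts=attempts)
        if attempts >= MAX_ATTEMPTS:
            dead_letters.append(failed)
        else:
            retry_queue.append(failed)
        return False


def drain(retry_queue, handler, dead_letters):
    """Re-process everything currently waiting on the retry queue."""
    while retry_queue:
        handle(retry_queue.popleft(), handler, retry_queue, dead_letters)
```

Whether the second queue is literally a different topic or the original one, the consumer still has to decide between "try again" and "give up", which is the same requeue()/reject() decision under another name.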
Like I mentioned above, it's understood that for reliability under
zeromq, impl_zeromq.py will need to handle ack/reject/requeue semantics
manually. When the time comes for CM to support ZMQ, I'm guessing we'll
have to be the ones to add that code.
Here's the salient point: For normal rpc, no one will ever see it or
have access to it. If people are calling join_consumer_pool(...,
ack_on_error=False) themselves, they have to assume all risk.
That's the only way for a developer to get this requeue()/reject()
behaviour.
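For illustration only, the semantics I have in mind look roughly like the sketch below. This is not the code under review: FakeMessage, RejectMessage and dispatch are hypothetical stand-ins, and only the ack_on_error flag name comes from the proposal. With ack_on_error=True you get today's behaviour (always ack); with ack_on_error=False an ordinary exception requeues the message and an explicit rejection drops it.

```python
class FakeMessage:
    """Hypothetical stand-in for a broker message; records its final state."""

    def __init__(self, body):
        self.body = body
        self.state = "unacked"

    def ack(self):
        self.state = "acked"

    def requeue(self):
        self.state = "requeued"

    def reject(self):
        self.state = "rejected"


class RejectMessage(Exception):
    """Hypothetical: raised by a consumer to say 'do not redeliver this'."""


def dispatch(message, callback, ack_on_error=True):
    """Deliver one message to callback and settle it with the broker."""
    try:
        callback(message.body)
    except RejectMessage:
        # The consumer has seen enough of this message; do not redeliver.
        if ack_on_error:
            message.ack()
        else:
            message.reject()
    except Exception:
        # Real code would log the error here. With the default flag the
        # message is acked anyway; otherwise it goes back on the queue.
        if ack_on_error:
            message.ack()
        else:
            message.requeue()
    else:
        message.ack()
    return message.state
```

In the use-case above, a consumer could raise something like RejectMessage once it has already written the record to disk, so the retry does not produce duplicates.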
-S
> This is what Celery does to retry tasks on failures, for example.
>
>
> FF
>