[openstack-dev] [marconi] Removing Get and Delete Messages by ID

Janczuk, Tomasz tomasz.janczuk at hp.com
Mon Jun 2 18:42:42 UTC 2014


First of all, I think the removal of "get message[s] by ID" is a great
change: it moves the Marconi APIs closer to typical messaging semantics.
However, I still see the "list messages" API in the spec
(https://wiki.openstack.org/wiki/Marconi/specs/api/v1.1#List_Messages). Is
the plan to leave this API in, or is it also going to be removed? If
the motivation for removing "get message[s] by ID" was to make it
easier to support different store backends (e.g. AMQP), I would expect the
same argument to apply to the "list messages" API, which allows random
access to messages in a queue.

Regarding deleting claimed messages, I think it should be possible to
claim multiple messages but then delete any of them individually. For
reference, those are the semantics of a "batch claim" that SQS, Azure, and
IronMQ have: during a batch claim a number of messages can be claimed, but
each of them is assigned a unique "token" that can later be used to delete
just that message. I believe there is a good reason it is organized that
way: even if a batch of messages is claimed, their processing can be
completely unrelated, and executed in different time frames. It does not
make sense to make the logical completion of message A conditional on the
successful processing of message B within the same batch. It also does not
make sense to hold up completion of all messages in a batch until the
message that takes the most time to process has completed.
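
To make the batch claim semantics above concrete, here is a minimal
in-memory sketch. It is not Marconi's, SQS's, Azure's, or IronMQ's actual
API; the class InMemoryQueue, its methods, and the token format are all
hypothetical names chosen for illustration. It only shows the idea that
one claim call hands out a separate delete token per message.

```python
import uuid


class InMemoryQueue:
    """Toy queue (hypothetical): each claimed message gets its own token."""

    def __init__(self):
        self._ready = []    # message bodies waiting to be claimed, FIFO
        self._claimed = {}  # delete token -> claimed message body

    def post(self, body):
        """Append a message to the queue."""
        self._ready.append(body)

    def claim(self, limit):
        """Claim up to `limit` messages; return a list of (token, body)."""
        batch, self._ready = self._ready[:limit], self._ready[limit:]
        result = []
        for body in batch:
            token = uuid.uuid4().hex  # unique per message, not per batch
            self._claimed[token] = body
            result.append((token, body))
        return result

    def delete(self, token):
        """Delete one claimed message; return False for an unknown token."""
        if token in self._claimed:
            del self._claimed[token]
            return True
        return False

    def claimed_count(self):
        """Number of messages claimed but not yet deleted."""
        return len(self._claimed)


queue = InMemoryQueue()
for body in ("A", "B", "C"):
    queue.post(body)

# One batch claim, two independent tokens.
(token_a, body_a), (token_b, body_b) = queue.claim(limit=2)

# Message A is done first; its deletion does not wait for message B.
queue.delete(token_a)
```

After the last call, message B is still claimed and can be deleted
whenever its own processing finishes, which is the independence the
per-message token is meant to provide.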

Put another way, a batch is merely an optimization over the atomic
operation of claiming and deleting a single message. The optimization can
allow multiple messages to be claimed at once; it can also allow multiple
messages to be deleted at once (I believe AMQP has those semantics); but it
should not prevent the basic use case of claiming or deleting a single
message.

On 5/29/14, 1:22 AM, "Flavio Percoco" <flavio at redhat.com> wrote:

>On 28/05/14 17:01 +0000, Kurt Griffiths wrote:
>>Crew, as discussed in the last team meeting, I have updated the API v1.1
>>spec to remove the ability to get one or more messages by ID. This was
>>done to remove unnecessary complexity from the API, and to make it easier
>>to support different types of message store backends.
>>
>>However, this now leaves us with asymmetric semantics. On the one hand,
>>we do not allow retrieving messages by ID, but we still support deleting
>>them by ID. It seems to me that deleting a message only makes sense in
>>the context of a claim or pop operation. In the case of a pop, the
>>message is already deleted by the time the client receives it, so I
>>don't see a need for including a message ID in the response. When
>>claiming a batch of messages, however, the client still needs some way
>>to delete each message after processing it. In this case, we either need
>>to allow the client to delete an entire batch of messages using the
>>claim ID, or we still need individual message IDs (hrefs) that can be
>>DELETEd.
>>
>>Deleting a batch of messages can be accomplished in V1.0 using "delete
>>multiple messages by ID". Regardless of how it is done, I've been
>>wondering if it is actually an anti-pattern; if a worker crashes after
>>processing N messages, but before deleting those same N messages, the
>>system is left with several messages that another worker will pick up
>>and potentially reprocess, although the work has already been done. If
>>the work is idempotent, this isn't a big deal. Otherwise, the client
>>will have to have a way to check whether a message has already been
>>processed, ignoring it if it has. But whether it is 1 message or N
>>messages left in a bad state by the first worker, the other worker has
>>to follow the same logic, so perhaps it would make sense after all to
>>simply allow deleting entire batches of claimed messages by claim ID,
>>and not worrying about providing individual message hrefs/IDs for
>>deletion.
>
>There are some risks related to claiming a set of messages and processing
>them in a batch rather than processing 1 message at a time. However,
>some of those risks are valid for both scenarios. For instance, if a
>worker claims just 1 message and dies before deleting it, the server
>will be left with an already processed message.
>
>I believe this is very specific to each use case. Based on their
>needs, users will have to choose between 'pop'ing messages out of the
>queue or claiming them. One way to provide more info to the user is by
>keeping track of how many times (or even the last time) a message has
>been claimed. I'm not a big fan of this because it'll add more
>complexity and, more importantly, we won't be able to support this on
>the AMQP driver.
>
>It's common to have this kind of 'tolerance' implemented in the
>client-side. The server must guarantee the delivery mechanism whereas
>the client must be tolerant enough based on the use-case.
>
>>
>>With all this in mind, I'm starting to wonder if I should revert my
>>changes to the spec, and wait to address these changes in the v2.0 API,
>>since it seems that to do this right, we need to make some changes that
>>are anything but "minor" (for a minor release).
>>
>>What does everyone think? Should we postpone this work to 2.0?
>
>I think this is quite a big change to have in a minor release. I vote
>for doing this in v2.0 and I'd also like us to put more thought into
>it. For example, accessing messages by ID seems to be an important
>feature in SQS. I'm not saying the decision must be based on that, but
>since Marconi's and SQS's targets are very similar, we should
>probably take a deeper look at the utility. Unfortunately, as
>mentioned above, it'll be hard to support this on the AMQP driver.
>
>Flavio
>
>-- 
>@flaper87
>Flavio Percoco
>_______________________________________________
>OpenStack-dev mailing list
>OpenStack-dev at lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



