[openstack-dev] [Ceilometer][Oslo] Consuming Notifications in Batches

Herndon, John Luke john.herndon at hp.com
Fri Dec 20 21:26:48 UTC 2013


On Dec 20, 2013, at 12:13 PM, Gordon Sim <gsim at redhat.com> wrote:

> On 12/20/2013 05:27 PM, Herndon, John Luke wrote:
>> 
>> On Dec 20, 2013, at 8:48 AM, Julien Danjou <julien at danjou.info>
>> wrote:
>>> Anyway, my main concern here is that I am not very enthusiastic
>>> about using the executor to do that. I wonder if there is not a way
>>> to ask the broker to get as many messages as it has, up to a
>>> limit?
>>> 
>>> You would have 100 messages waiting in the notifications.info
>>> queue, and you would be able to tell oslo.messaging that you
>>> want to read up to 10 messages at a time. If the underlying
>>> protocol (e.g. AMQP) can support that too, it would be more
>>> efficient.
>> 
>> Yeah, I like this idea. As far as I can tell, AMQP doesn’t support
>> grabbing more than a single message at a time, but we could
>> definitely have the broker store up the batch before sending it
>> along.
> 
> AMQP (in all its versions) allows for a subscription with a configurable amount of 'prefetch', which means the broker can send lots of messages without waiting for the client to request them one at a time.
> 
> That's not quite the same as the batching I think you are looking for, but it does allow the broker to do its own batching. My guess is the rabbit driver is already using basic.consume rather than basic.get anyway(?), so the broker is free to batch as it sees fit.  (I haven't actually dug into the kombu code to verify that however, perhaps someone else here can confirm?)
> 
Yeah, that should help performance a bit, but we will still need to work out the batching logic. I think basic.consume is likely the best way to go; it should make the timeout mechanism I’m looking for straightforward to implement. Thanks for the tip :).
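
To make the batching-with-timeout idea concrete, here is a rough sketch of the client-side logic. It is not oslo.messaging or kombu code: a plain queue.Queue stands in for messages the driver has already delivered, and the name collect_batch and its parameters are hypothetical.

```python
import queue
import time


def collect_batch(source, max_size, timeout):
    """Return up to max_size items from source, waiting at most timeout seconds.

    `source` is assumed to be a queue.Queue fed by the messaging driver;
    this is a stand-in, not an oslo.messaging interface.
    """
    batch = []
    deadline = time.monotonic() + timeout
    while len(batch) < max_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # timeout reached: hand back a partial batch
        try:
            batch.append(source.get(timeout=remaining))
        except queue.Empty:
            break  # nothing more arrived before the deadline
    return batch
```

The batch is returned as soon as it is full, or when the deadline passes, whichever comes first, so a slow trickle of notifications is not held back indefinitely.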

> However you still need the client to have some way of batching up the messages and then processing them together.
> 
>> Other protocols may support bulk consumption. My one concern
>> with this approach is error handling. Currently the executors treat
>> each notification individually. So let’s say the broker hands over 100
>> messages at a time. When the client is done processing the messages, the
>> broker needs to know if message 25 had an error or not. We would
>> somehow need to communicate back to the broker which messages failed.
>> I think this may take some refactoring of executors/dispatchers. What
>> do you think?
> 
> I have some related questions that I haven't yet satisfactorily answered. The extra context here may be useful in doing so.
> 
> (1) What are the expectations around message delivery guarantees for insertion into a store? I.e. if there is a failure, is it ok to get duplicate entries for notifications? (I'm assuming losing notifications is not acceptable).
I think there is probably a tolerance for duplicates but you’re right, missing a notification is unacceptable. Can anyone weigh in on how big of a deal duplicates are for meters? Duplicates aren’t really unique to the batching approach, though. If a consumer dies after it’s inserted a message into the data store but before the message is acked, the message will be requeued and handled by another consumer resulting in a duplicate. 
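
If duplicates do turn out to matter, one possible mitigation is to make the insert idempotent. The sketch below assumes each notification carries a unique "message_id" field and uses an in-memory dict as a stand-in for the real data store; DedupingStore is a hypothetical name, not existing ceilometer code.

```python
class DedupingStore:
    """Toy store that ignores a notification it has already seen.

    Assumes notifications are dicts with a unique "message_id" key.
    """

    def __init__(self):
        self._rows = {}

    def insert(self, notification):
        """Insert once per message_id; return True only if a row was added."""
        key = notification["message_id"]
        if key in self._rows:
            return False  # redelivered after a missed ack: skip it
        self._rows[key] = notification
        return True

    def __len__(self):
        return len(self._rows)
```

With something like this, a message that is requeued after the consumer died between insert and ack would be stored only once.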

> (2) What would you want the broker to do with the failed messages? What sort of things might fail? Is it related to the message content itself? Or is it failures suspected to be of a temporal nature?
There will be situations where the message can’t be parsed, and those messages can’t just be thrown away. My current thought is that ceilometer could provide some sort of mechanism for sending invalid messages to an external data store (like a file, or a different topic on the AMQP server) where a living, breathing human can look at them and try to parse out any meaningful information. Other errors might be “database not available”, in which case requeuing the message is probably the right way to go. If the consumer process crashes, all of the unacked messages need to be requeued and handled by a different consumer. Any other error cases?
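
As a rough sketch of how those cases could be separated within a batch: the exception classes and triage_batch below are hypothetical names made up for illustration, and the caller is assumed to do the actual ack, requeue and dead-letter operations with the three lists returned.

```python
class InvalidMessage(Exception):
    """Hypothetical: the message could not be parsed."""


class StoreUnavailable(Exception):
    """Hypothetical: a transient failure such as "database not available"."""


def triage_batch(messages, handle):
    """Run handle(msg) on each message and sort the batch by outcome.

    Returns (ack, dead_letter, requeue):
      ack         - handled successfully
      dead_letter - unparseable, to be set aside for a human to inspect
      requeue     - not handled because the store became unavailable
    """
    ack, dead_letter, requeue = [], [], []
    for index, msg in enumerate(messages):
        try:
            handle(msg)
        except InvalidMessage:
            dead_letter.append(msg)
        except StoreUnavailable:
            # Stop here; this message and the rest of the batch go back.
            requeue.extend(messages[index:])
            break
        else:
            ack.append(msg)
    return ack, dead_letter, requeue
```

The consumer-crash case needs no code on the client side, since the broker requeues whatever was never acked.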

> (3) How important is ordering? If a failure causes some notifications to be inserted out of order, is that a problem at all?
From an event point of view, I don’t think this is a problem since the events have a generated timestamp.


-----------------
John Herndon
HP Cloud
john.herndon at hp.com


