[openstack-dev] [Ceilometer][Oslo] Consuming Notifications in Batches

Herndon, John Luke john.herndon at hp.com
Fri Dec 20 21:41:59 UTC 2013


On Dec 20, 2013, at 1:12 PM, Dan Dyer <dan.dyer00 at gmail.com> wrote:

> On 12/20/2013 11:18 AM, Herndon, John Luke wrote:
>> On Dec 20, 2013, at 10:47 AM, Julien Danjou <julien at danjou.info> wrote:
>> 
>>> On Fri, Dec 20 2013, Herndon, John Luke wrote:
>>> 
>>>> Yeah, I like this idea. As far as I can tell, AMQP doesn’t support grabbing
>>>> more than a single message at a time, but we could definitely have the
>>>> broker store up the batch before sending it along. Other protocols may
>>>> support bulk consumption. My one concern with this approach is error
>>>> handling. Currently the executors treat each notification individually. So
>>>> let’s say the broker hands over 100 messages at a time. When the client is done
>>>> processing the messages, the broker needs to know if message 25 had an error
>>>> or not. We would somehow need to communicate back to the broker which
>>>> messages failed. I think this may take some refactoring of
>>>> executors/dispatchers. What do you think?
>>> Yeah, the messaging API definitely needs to change a bit to handle
>>> such a case. But in the end it will be a good thing to support it,
>>> whether or not the broker supports it natively.
>>> 
>>> For brokers where it's not possible, it may be simple enough to have a
>>> "get_one_notification_nb()" method that would either return a
>>> notification or None if there's none to read, and that would
>>> consequently have to be _non-blocking_.
>>> 
>>> So if the transport is smart we write:
>>> 
>>>  # Return up to max_number_of_notifications_to_read
>>>  notifications = transport.get_notifications(
>>>      conf.max_number_of_notifications_to_read)
>>>  storage.record(notifications)
>>> 
>>> Otherwise we do:
>>> 
>>>  notifications = []
>>>  for i in range(conf.max_number_of_notifications_to_read):
>>>      # Non-blocking read: returns None when nothing is waiting
>>>      notification = transport.get_one_notification_nb()
>>>      if notification:
>>>          notifications.append(notification)
>>>      else:
>>>          break
>>>  storage.record(notifications)
>>> 
>>> So it's just about having the right primitive in oslo.messaging, we can
>>> then build on top of that wherever that is.
>>> 
>> I think this will work. I was considering adding a timeout so that the batch is not sent off immediately, and implementing it with blocking calls. If the consumer consumes faster than the publishers publish, non-blocking reads just degenerate into single-notification batches, so it may be beneficial to wait for more messages to arrive before sending off the batch. If the batch fills up before the timeout is reached, it would be sent off right away.
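To make the timeout idea concrete, here is a rough sketch only. It is not oslo.messaging code: read_batch, max_size and timeout are made-up names, and a standard-library queue.Queue stands in for whatever blocking read the transport would offer.

```python
import queue
import time


def read_batch(source, max_size, timeout):
    """Collect up to max_size items, waiting at most `timeout` seconds overall.

    `source` is assumed to behave like queue.Queue (a stand-in for the
    transport); none of these names exist in oslo.messaging.
    """
    batch = []
    deadline = time.monotonic() + timeout
    while len(batch) < max_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # timeout reached: send off whatever has arrived
        try:
            # Blocking read, bounded by the time left for this batch
            batch.append(source.get(timeout=remaining))
        except queue.Empty:
            break  # nothing else arrived before the deadline
    return batch


if __name__ == "__main__":
    q = queue.Queue()
    for n in range(5):
        q.put({"id": n})
    print(read_batch(q, max_size=3, timeout=0.5))   # full batch, no waiting
    print(read_batch(q, max_size=3, timeout=0.05))  # partial batch after timeout
```

A full batch returns as soon as max_size is reached; a partial batch returns when the deadline passes, so a slow publisher never stalls the consumer for longer than the timeout.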
>> 
>>> -- 
>>> Julien Danjou
>>> /* Free Software hacker * independent consultant
>>>   http://julien.danjou.info */
>> -----------------
>> John Herndon
>> HP Cloud
>> john.herndon at hp.com
>> 
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> A couple of things that I think need to be emphasized here:
> 1. the mechanism needs to be configurable, so if you are more worried about reliability than performance you would be able to turn off bulk loading
It will definitely be configurable, but I don’t think batching is going to be any less reliable than individual inserts. Can you expand on what concerns you?
> 2. the caching size should also be configurable, so that we can limit your exposure to lost messages
Agreed.
> 3. while you can have the message queue hold the messages until you acknowledge them, it seems like this adds a lot of complexity to the interaction. You will need to be able to propagate this information all the way back from the storage driver.
This is actually a pretty standard use case for AMQP; we have done it several times on in-house projects. The basic.ack call, with its multiple flag set, lets you acknowledge a whole batch of messages at once. Yes, we do have to figure out how to propagate the error cases back up to the broker, but I don’t think it will be so complicated that it’s not worth doing.
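As a rough illustration of acknowledging a batch in one call, here is a sketch under stated assumptions. FakeChannel and settle_batch are hypothetical names, not part of oslo.messaging or any client library; the two channel methods are named after the AMQP basic.ack method and RabbitMQ's basic.nack extension. It also assumes the batch covers every unacknowledged delivery on the channel, since multiple=True settles everything up to the given tag.

```python
class FakeChannel:
    """Hypothetical stand-in for an AMQP channel; it only records calls."""

    def __init__(self):
        self.calls = []

    def basic_ack(self, delivery_tag, multiple=False):
        self.calls.append(("ack", delivery_tag, multiple))

    def basic_nack(self, delivery_tag, multiple=False, requeue=True):
        self.calls.append(("nack", delivery_tag, multiple, requeue))


def settle_batch(channel, batch, record):
    """Record a batch, then ack or nack all of it with one broker call.

    `batch` is a list of (delivery_tag, payload) pairs in delivery order;
    `record` is whatever callable writes the payloads to storage.
    """
    if not batch:
        return False
    last_tag = batch[-1][0]
    try:
        record([payload for _tag, payload in batch])
    except Exception:
        # Storage failed: hand every delivery up to last_tag back to the broker
        channel.basic_nack(last_tag, multiple=True, requeue=True)
        return False
    # multiple=True acknowledges every unacked delivery up to last_tag
    channel.basic_ack(last_tag, multiple=True)
    return True


if __name__ == "__main__":
    channel = FakeChannel()
    stored = []
    settle_batch(channel, [(1, "a"), (2, "b"), (3, "c")], stored.extend)
    print(channel.calls)
```

This treats the batch as all-or-nothing. Reporting that only message 25 failed would need per-message results from the storage driver, which is the refactoring of executors/dispatchers mentioned earlier in the thread.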
> 4. any integration that is dependent on a specific configuration on the rabbit server is brittle, since we have seen a lot of variation between services on this. I would prefer to control the behavior on the collection side
Hm, I don’t understand…?
> So in general, I would prefer a mechanism that pulls the data in a default manner, caches on the collection side based on configuration that allows you to determine your own risk level, and then manages retries in the storage driver or at the cache controller level.
If you’re caching on the collector and the collector dies, then you’ve lost the whole batch of messages. Then you have to invent some way of persisting the messages to disk until they have been committed to the db, and removing them afterwards. We originally talked about implementing a batching layer in the storage driver, but dragondm pointed out that the message queue is already hanging on to the messages and ensuring delivery, so it’s better not to reinvent that piece of the pipeline. This is a huge motivating factor for pursuing batching in oslo, in my opinion.

> 
> Dan Dyer
> HP cloud
> dan.dyer at hp.com
> 

-----------------
John Herndon
HP Cloud
john.herndon at hp.com


