[openstack-dev] [Ceilometer][Oslo] Consuming Notifications in Batches

Boris Pavlovic bpavlovic at mirantis.com
Sat Dec 21 09:51:57 UTC 2013


Jay,

> The session object will be commit()'d after the session.begin() context
> manager exits, which will cause the aforementioned BEGIN; INSERT; COMMIT;
> transaction to be executed against the server for each event record.

That is only half of the problem. We should use SQL bulk inserts as well.
And we should note that declarative_base won't make this work transparently
for us. (Even if we put everything into one transaction, there will still be
N INSERTs.)


Best regards,
Boris Pavlovic



On Sat, Dec 21, 2013 at 10:57 AM, Jay Pipes <jaypipes at gmail.com> wrote:

> On 12/20/2013 04:43 PM, Julien Danjou wrote:
>
>> On Fri, Dec 20 2013, Herndon, John Luke wrote:
>>
>>> I think there is probably a tolerance for duplicates but you’re right,
>>> missing a notification is unacceptable. Can anyone weigh in on how big of
>>> a deal duplicates are for meters? Duplicates aren’t really unique to the
>>> batching approach, though. If a consumer dies after it’s inserted a
>>> message into the data store but before the message is acked, the message
>>> will be requeued and handled by another consumer resulting in a duplicate.
>>>
>>
>> Duplicates can be a problem for metering: if you see the same event
>> twice, you may think it happened twice.
>>
>> As for event storage, it won't be a problem if you use a good storage
>> driver that supports unique constraints; you'll just drop the duplicate
>> and log the fact that this should not have happened, or something like
>> that.
>>
>
> The above brings up a point related to the implementation of the existing
> SQL driver code that will need to be re-thought with the introduction of
> batch notification processing.
>
> Currently, the SQL driver's record_events() method [1] is written in a way
> that forces a new INSERT transaction for every record supplied to the
> method. If the record_events() method is called with 10K events, then 10K
> BEGIN; INSERT ...; COMMIT; transactions are executed against the server.
>
> Suffice to say, this isn't efficient. :)
>
> Ostensibly, from looking at the code, the reason this approach was taken
> was to allow duplicate event IDs to be collected and returned to the
> caller.
>
> Because of this code:
>
>         for event_model in event_models:
>             event = None
>             try:
>                 with session.begin():
>                     event = self._record_event(session, event_model)
>             except dbexc.DBDuplicateEntry:
>                 problem_events.append((api_models.Event.DUPLICATE,
>                                        event_model))
>
> The session object will be commit()'d after the session.begin() context
> manager exits, which will cause the aforementioned BEGIN; INSERT; COMMIT;
> transaction to be executed against the server for each event record.
>
> If we want to actually take advantage of the performance benefits of
> batching notification messages, the above code will need to be rewritten so
> that a single transaction is executed against the database for the entire
> batch of events.
>
> Best,
> -jay
>
> [1] https://github.com/openstack/ceilometer/blob/master/ceilometer/storage/impl_sqlalchemy.py#L932
>
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>