[openstack-dev] [Heat] event table is a ticking time bomb

Clint Byrum clint at fewbar.com
Thu Aug 8 20:16:22 UTC 2013


Last night while reviewing a feature which would add more events to the
event table, it dawned on me that the event table really must be removed.

https://bugs.launchpad.net/heat/+bug/1209492

tl;dr: users can write an unbounded number of rows to the event table at
a fairly alarming rate just by creating and updating a very large stack
whose resources take no time to create and are not even billable (like
an autoscaling launch configuration).

The table has no purge function, so the only way to clear out old events
is to delete the stack, or manually remove them directly in the database.

We've all been through this before: logging to a database seems great
until you actually do it.

I have some ideas for how to solve it, but I wanted to get a wider
audience:

1) Make the event list a ring buffer. Have rows 0 through
$MAX_BUFFER_SIZE - 1 in each stack, and simply write each new event to
the next open position, wrapping to 0 at $MAX_BUFFER_SIZE. Pros: little
change to current code, just need an offset column added and code that
will properly wrap to 0 at $MAX_BUFFER_SIZE. Cons: still can incur heavy
transactional load on the database server.

1.b) Same, but instead of rows, just maintain a blob and append the
events to it as a JSON list. Lowers transactional load but would push
some load onto the API servers and such to parse these out, and would
make pagination challenging. Blobs also can be a drain on DB server
performance.

2) Write a purge script that deletes old events. Pros: No change to
existing code, just new code to do purging. Cons: same as 1, plus more
vulnerability to an aggressive attacker who can fit a lot of data in
between purges. Also, large-scale deletes can be really painful (see:
keystone sql token backend).

3) Log events to Swift. I can't seem to find information on how/if
appending works there. Tons of tiny single-row files is an option, but I
want to hear from people with more Swift knowledge if that is a viable,
performant option. Pros: Scale to the moon. Can charge tenant for usage
and let them purge events as needed. Cons: Adds swift as a requirement
of Heat.

4) Provide a way for users to receive logs via HTTP POST. Pros: Simple
and punts the problem to the users. Cons: users will be SoL if they
don't have a place to have logs posted to.

5) Provide a way for users to receive logs via a messaging service like
Marconi. Pros/Cons: same as HTTP, but perhaps a little more confusing
and ambitious given Marconi's short existence.

6) Provide a pluggable backend for logging. This seems like the way most
OpenStack projects solve these issues, which is to let the deployers
choose and/or provide their own way to handle a sticky problem. Pros:
Simple and flexible for the future. Cons: Would require writing at least
one backend provider that does what the previous 5 options suggest.
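
To make option 1 a bit more concrete, here is a rough sketch of the
ring buffer idea. It is not Heat code: the table layout, the column
names (slot, seq), the helper functions and the tiny MAX_BUFFER_SIZE
are all made up for illustration, and it uses SQLite's INSERT OR REPLACE
where another database would need its own upsert. It also ignores
concurrent writers to the same stack.

```python
import sqlite3

# Hypothetical per-stack cap; kept tiny so the wrap-around is easy to see.
MAX_BUFFER_SIZE = 4


def init_db(conn):
    # "slot" plays the role of the offset column; "seq" keeps ordering.
    conn.execute(
        "CREATE TABLE event ("
        " stack_id TEXT NOT NULL,"
        " slot INTEGER NOT NULL,"
        " seq INTEGER NOT NULL,"
        " message TEXT NOT NULL,"
        " PRIMARY KEY (stack_id, slot))"
    )


def record_event(conn, stack_id, message):
    """Write an event into the next slot, overwriting the oldest one."""
    with conn:  # commit (or roll back) once per event
        row = conn.execute(
            "SELECT COALESCE(MAX(seq), -1) FROM event WHERE stack_id = ?",
            (stack_id,),
        ).fetchone()
        seq = row[0] + 1
        slot = seq % MAX_BUFFER_SIZE  # wraps to 0 at MAX_BUFFER_SIZE
        conn.execute(
            "INSERT OR REPLACE INTO event (stack_id, slot, seq, message)"
            " VALUES (?, ?, ?, ?)",
            (stack_id, slot, seq, message),
        )
    return seq


def list_events(conn, stack_id):
    """Return the surviving events for a stack, oldest first."""
    rows = conn.execute(
        "SELECT message FROM event WHERE stack_id = ? ORDER BY seq",
        (stack_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

With this layout a stack never holds more than MAX_BUFFER_SIZE rows, but
every event is still one write transaction, which is the con noted in
option 1.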

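For option 6, the shape I have in mind is roughly the following. Again
this is only a sketch under my own assumptions: EventBackend, the two
example backends, the BACKENDS registry and load_backend are
hypothetical names, not anything that exists in Heat today, and a real
version would pick the backend from deployer configuration.

```python
import abc
from collections import defaultdict, deque


class EventBackend(abc.ABC):
    """Hypothetical interface every event backend would implement."""

    @abc.abstractmethod
    def record(self, stack_id, message):
        """Store one event for the given stack."""

    @abc.abstractmethod
    def events(self, stack_id):
        """Return the stored events for the stack, oldest first."""


class BoundedMemoryBackend(EventBackend):
    """Keeps only the newest max_events events per stack, in memory."""

    def __init__(self, max_events=1000):
        self._events = defaultdict(lambda: deque(maxlen=max_events))

    def record(self, stack_id, message):
        self._events[stack_id].append(message)

    def events(self, stack_id):
        return list(self._events.get(stack_id, ()))


class NullBackend(EventBackend):
    """Drops everything; for deployers who ship events elsewhere."""

    def record(self, stack_id, message):
        pass

    def events(self, stack_id):
        return []


# Hypothetical registry mapping a deployer-chosen name to a backend class.
BACKENDS = {"memory": BoundedMemoryBackend, "null": NullBackend}


def load_backend(name, **options):
    try:
        cls = BACKENDS[name]
    except KeyError:
        raise ValueError("unknown event backend: %r" % name)
    return cls(**options)
```

The point is that the engine only ever talks to record() and events(),
so a database, Swift, HTTP POST or messaging backend could each be
dropped in behind the same two calls.
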
To be clear: Heat cannot really exist without this, as it is the only way
to find out what your stack is doing or has done.


