[openstack-dev] [Heat] event table is a ticking time bomb
Clint Byrum
clint at fewbar.com
Fri Aug 9 15:56:31 UTC 2013
Excerpts from Sandy Walsh's message of 2013-08-09 06:16:55 -0700:
>
> On 08/08/2013 11:36 PM, Angus Salkeld wrote:
> > On 08/08/13 13:16 -0700, Clint Byrum wrote:
> >> Last night while reviewing a feature which would add more events to the
> >> event table, it dawned on me that the event table really must be removed.
> >
> >
> >>
> >> https://bugs.launchpad.net/heat/+bug/1209492
> >>
> >> tl;dr: users can write an infinite number of rows to the event table at
> >> a fairly alarming rate just by creating and updating a very large stack
> >> that has no resources that cost any time or are even billable (like an
> >> autoscaling launch configuration).
> >>
> >> The table has no purge function, so the only way to clear out old events
> >> is to delete the stack, or manually remove them directly in the database.
> >>
> >> We've all been through this before, logging to a database seems great
> >> until you actually do it.
> >>
> >> I have some ideas for how to solve it, but I wanted to get a wider
> >> audience:
> >>
> >> 1) Make the event list a ring buffer. Have rows 0 - $MAX_BUFFER_SIZE in
> >> each stack, and simply write each new event to the next open position,
> >> wrapping at $MAX_BUFFER_SIZE. Pros: little change to current code,
> >> just need an offset column added and code that will properly wrap to 0
> >> at $MAX_BUFFER_SIZE. Cons: still can incur heavy transactional load on
> >> the database server.
> >>
> >> 1.b) Same, but instead of rows, just maintain a blob and append the rows
> >> as json list. Lowers transactional load but would push some load onto
> >> the API servers and such to parse these out, and would make pagination
> >> challenging. Blobs also can be a drain on DB server performance.
> >>
> >> 2) Write a purge script. Delete old ones. Pros: No code change, just
> >> new code to do purging. Cons: same as 1, plus more vulnerability to an
> >> aggressive attacker who can fit a lot of data in between purges. Also
> >> large scale deletes can be really painful (see: keystone sql token
> >> backend).
> >>
> >> 3) Log events to Swift. I can't seem to find information on how/if
> >> appending works there. Tons of tiny single-row files is an option, but I
> >> want to hear from people with more swift knowledge if that is a viable,
> >> performant option. Pros: Scale to the moon. Can charge tenant for usage
> >> and let them purge events as needed. Cons: Adds swift as a requirement
> >> of Heat.
> >>
> >> 4) Provide a way for users to receive logs via HTTP POST. Pros: Simple
> >> and punts the problem to the users. Cons: users will be SoL if they
> >> don't have a place to have logs posted to.
> >>
> >> 5) Provide a way for users to receive logs via messaging service like
> >> Marconi. Pros/Cons: same as HTTP, but perhaps a little more confusing
> >> and ambitious given Marconi's short existence.
> >>
> >> 6) Provide a pluggable backend for logging. This seems like the way most
> >> OpenStack projects solve these issues, which is to let the deployers
> >> choose and/or provide their own way to handle a sticky problem. Pros:
> >> Simple and flexible for the future. Cons: Would require writing at least
> >> one backend provider that does what the previous 5 options suggest.
> >>
> >> To be clear: Heat cannot really exist without this, as it is the only way
> >> to find out what your stack is doing or has done.
> >
> > btw Clint I have ditched that "Recorder" patch as Ceilometer is
> > getting an Alarm History API soon, so we can defer to that for that
> > functionality (alarm transitions).
> >
> > But we still need a better way to record events/logs for the user.
> > So I made this blueprint a while ago:
> > https://blueprints.launchpad.net/heat/+spec/user-visible-logs
> >
> > I am becoming more in favor of user options rather than deployer
> > options if possible. So provide resources for Marconi, Meniscus and
> > whatever...
> > Although what is nice about Marconi is you could then hook up
> > whatever you want to it.
>
> Logs are one thing (and Meniscus is a great choice for that), but events
> are the very thing CM is designed to handle. Wouldn't it make sense to
> push them back into there?
>
I'm not sure these events make sense in the current Ceilometer (I assume
that is "CM" above) context. These events are:

    ... Creating stack A
    ... Creating stack A resource A
    ... Created stack A resource A
    ... Created stack A

Users will want to be able to see all of the events for a stack, and
likely we need to be able to paginate through them as well.

They are fundamental and low level enough for Heat that I'm not sure
putting them in Ceilometer makes much sense, but maybe I don't understand
Ceilometer.. or "CM" is something else entirely. :)
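
To make the ring-buffer idea (option 1 in the quoted message) and the
pagination requirement a bit more concrete, here is a rough sketch. It is
not Heat's actual schema or code: the table layout, the `slot` and `seq`
column names, the function names, and the tiny MAX_BUFFER_SIZE are all
assumptions made up for illustration, and it uses an in-process SQLite
database only so that it can be run standalone.

```python
import sqlite3

# Hypothetical per-stack cap; kept tiny so the wrap-around is easy to see.
MAX_BUFFER_SIZE = 4


def init_db(conn):
    """Create an illustrative event table (not Heat's real schema)."""
    conn.execute(
        "CREATE TABLE event ("
        " stack_id TEXT NOT NULL,"
        " slot INTEGER NOT NULL,"    # ring position, 0 .. MAX_BUFFER_SIZE - 1
        " seq INTEGER NOT NULL,"     # ever-increasing per-stack counter
        " message TEXT NOT NULL,"
        " PRIMARY KEY (stack_id, slot))"
    )


def record_event(conn, stack_id, message):
    """Write an event into the next slot, overwriting the oldest on wrap."""
    with conn:  # commit on success, roll back on error
        (last_seq,) = conn.execute(
            "SELECT COALESCE(MAX(seq), -1) FROM event WHERE stack_id = ?",
            (stack_id,),
        ).fetchone()
        seq = last_seq + 1
        conn.execute(
            "INSERT OR REPLACE INTO event (stack_id, slot, seq, message)"
            " VALUES (?, ?, ?, ?)",
            (stack_id, seq % MAX_BUFFER_SIZE, seq, message),
        )


def list_events(conn, stack_id, limit=10, after_seq=-1):
    """Return up to `limit` events newer than `after_seq`, oldest first."""
    return conn.execute(
        "SELECT seq, message FROM event"
        " WHERE stack_id = ? AND seq > ?"
        " ORDER BY seq LIMIT ?",
        (stack_id, after_seq, limit),
    ).fetchall()
```

A caller would page by passing the last `seq` it saw as `after_seq`. The
row count per stack stays bounded at MAX_BUFFER_SIZE, but every event is
still a write transaction, which is the con already noted for option 1.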