[openstack-dev] [Nova] Handling soft delete for instance rows in a new cells database
sross at redhat.com
Tue Nov 25 16:54:40 UTC 2014
> I can't comment on other projects, but Nova definitely needs the soft
> delete in the main nova database. Perhaps not for every table, but
> there is definitely code in the code base which uses it right now.
> Search for read_deleted=True if you're curious.
Just to save people a bit of time, it's actually `read_deleted='yes'`
or `read_deleted="yes"` for many cases.
Just to give people a quick overview:
A cursory glance (no pun intended) seems to indicate that quite a few of
these are reading potentially deleted flavors. For this case, it makes
sense to keep things in one table (as we do).
There are also quite a few that seem to be making sure deleted "things"
are properly cleaned up. In this case, 'deleted' acts as a "CLEANUP"
state, so it makes just as much sense to keep the deleted rows in a
> For this case in particular, the concern is that operators might need
> to find where an instance was running once it is deleted to be able to
> diagnose issues reported by users. I think that's a valid use case of
> this particular data.
> >> This is a new database, so its our big chance to get this right. So,
> >> ideas welcome...
> >> Some initial proposals:
> >> - we do what we do in the current nova database -- we have a deleted
> >> column, and we set it to true when we delete the instance.
> >> - we have shadow tables and we move delete rows to a shadow table.
> > Both approaches are viable, but as the soft-delete column is widespread, it
> > would be thorny for this new app to use some totally different scheme,
> > unless the notion is that all schemes should move to the audit table
> > approach (which I wouldn’t mind, but it would be a big job). FTR, the
> > audit table approach is usually what I prefer for greenfield development,
> > if all that’s needed is forensic capabilities at the database inspection
> > level, and not as much active GUI-based “deleted” flags. That is, if you
> > really don’t need to query the history tables very often except when
> > debugging an issue offline. The reason its preferable is because those
> > rows are still “deleted” from your main table, and they don’t get in the
> > way of querying. But if you need to refer to these history rows in
> > context of the application, that means you need to get them mapped in such
> > a way that they behave like the primary rows, which overall is a more
> > difficult approach than just using the soft delete column.
I think it does really come down here to how you intend to use the soft-delete
functionality in Cells. If you just are using it to debug or audit, then I think
the right way to go would be either the audit table (potentially can store more
lifecycle data, but could end up taking up more space) or a separate shadow
table (takes up less space).
If you are going to use the soft delete for application functionality, I would
consider differentiating between "deleted" and "we still have things left to
clean up", since this seems to be mixing two different requirements into one.
> > That said, I have a lot of plans to send improvements down the way of the
> > existing approach of “soft delete column” into projects, from the querying
> > POV, so that criteria to filter out soft delete can be done in a much more
> > robust fashion (see
> > https://bitbucket.org/zzzeek/sqlalchemy/issue/3225/query-heuristic-inspector-event).
> > But this is still more complex and less performant than if the rows are
> > just gone totally, off in a history table somewhere (again, provided you
> > really don’t need to look at those history rows in an application context,
> > otherwise it gets all complicated again).
> Interesting. I hadn't seen consistency between the two databases as
> trumping doing this less horribly, but it sounds like its more of a
> thing that I thought.
> Rackspace Australia
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
More information about the OpenStack-dev