[openstack-dev] [Nova] Handling soft delete for instance rows in a new cells database

Mike Bayer mbayer at redhat.com
Tue Nov 25 00:14:06 UTC 2014

> On Nov 24, 2014, at 5:20 PM, Michael Still <mikal at stillhq.com> wrote:
> Heya,
> Review https://review.openstack.org/#/c/135644/4 proposes the addition
> of a new database for our improved implementation of cells in Nova.
> However, there's an outstanding question about how to handle soft
> delete of rows -- we believe that we need to soft delete for forensic
> purposes.

Every time I talk to people about soft delete, I hear the usual refrain: “we thought we needed it, but we didn’t, and now it’s just overbuilt cruft we want to get rid of”.

I’m not saying you don’t have a need here, but you’re sure you actually have this need and aren’t just following the herd, right? Soft delete makes things a lot less convenient.

> This is a new database, so it's our big chance to get this right. So,
> ideas welcome...
> Some initial proposals:
> - we do what we do in the current nova database -- we have a deleted
> column, and we set it to true when we delete the instance.
> - we have shadow tables and we move deleted rows to a shadow table.

Both approaches are viable, but as the soft-delete column is widespread, it would be thorny for this new app to use some totally different scheme, unless the notion is that all schemes should move to the audit table approach (which I wouldn’t mind, but it would be a big job).

FTR, the audit table approach is usually what I prefer for greenfield development, if all that’s needed is forensic capability at the database inspection level, and not so much active GUI-based “deleted” flags. That is, if you really don’t need to query the history tables very often except when debugging an issue offline. The reason it’s preferable is that those rows are still “deleted” from your main table, and they don’t get in the way of querying. But if you need to refer to these history rows in the context of the application, you need to get them mapped in such a way that they behave like the primary rows, which overall is a more difficult approach than just using the soft delete column.
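
To make the trade-off concrete, here is a minimal sketch of the two schemes using Python's standard sqlite3 module. The table names (instances, shadow_instances), the columns, and the helper functions are hypothetical and chosen only for illustration; they are not Nova's actual schema or code.

```python
import sqlite3

# Hypothetical schema for illustration only; not Nova's real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (
        id INTEGER PRIMARY KEY,
        hostname TEXT,
        deleted INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE shadow_instances (
        id INTEGER PRIMARY KEY,
        hostname TEXT,
        deleted_at TEXT
    );
""")
conn.executemany(
    "INSERT INTO instances (id, hostname) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)


def soft_delete(conn, instance_id):
    """Soft delete column: the row stays in the main table, flagged."""
    with conn:
        conn.execute(
            "UPDATE instances SET deleted = 1 WHERE id = ?", (instance_id,)
        )


def archive_delete(conn, instance_id):
    """Shadow table: copy the row out, then really delete it, atomically."""
    with conn:
        conn.execute(
            "INSERT INTO shadow_instances (id, hostname, deleted_at) "
            "SELECT id, hostname, datetime('now') FROM instances WHERE id = ?",
            (instance_id,),
        )
        conn.execute("DELETE FROM instances WHERE id = ?", (instance_id,))


soft_delete(conn, 1)
archive_delete(conn, 2)

# With a soft delete column, every read has to remember the filter.
live_with_filter = [
    r[0] for r in conn.execute(
        "SELECT id FROM instances WHERE deleted = 0 ORDER BY id"
    )
]
# Forgetting the filter silently returns the soft-deleted row as well.
all_rows = [r[0] for r in conn.execute("SELECT id FROM instances ORDER BY id")]
# The archived row is out of the main table but still available for forensics.
archived = [r[0] for r in conn.execute("SELECT id FROM shadow_instances")]
```

In this sketch the archived row never shows up in ordinary queries against instances, while the soft-deleted row does unless each query carries the extra criterion.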

That said, I have a lot of plans to push improvements to the existing “soft delete column” approach down into projects, from the querying POV, so that the criteria to filter out soft-deleted rows can be applied in a much more robust fashion (see https://bitbucket.org/zzzeek/sqlalchemy/issue/3225/query-heuristic-inspector-event). But this is still more complex and less performant than if the rows are just gone entirely, off in a history table somewhere (again, provided you really don’t need to look at those history rows in an application context; otherwise it gets complicated again).
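
As a rough illustration of what "more robust" filtering could mean, the sketch below routes every read through one helper that always appends the soft-delete criterion, so individual callers cannot forget it. This is not the SQLAlchemy feature discussed in the linked issue; select_live, SOFT_DELETE_TABLES, and the instances table are hypothetical names, and the example uses only Python's standard sqlite3 module.

```python
import sqlite3

# Hypothetical whitelist of tables that carry a "deleted" column.
SOFT_DELETE_TABLES = {"instances"}


def select_live(conn, table, where="", params=()):
    """Return rows from `table`, always excluding soft-deleted ones.

    `where` is an optional extra SQL condition with ? placeholders.
    """
    if table not in SOFT_DELETE_TABLES:
        raise ValueError(f"{table!r} is not a known soft-delete table")
    sql = f"SELECT * FROM {table} WHERE deleted = 0"
    if where:
        sql += f" AND ({where})"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instances ("
    "id INTEGER PRIMARY KEY, hostname TEXT, deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO instances (id, hostname, deleted) VALUES (?, ?, ?)",
    [(1, "a", 0), (2, "b", 1), (3, "a", 0)],
)

# Callers never write "deleted = 0" themselves.
live_ids = sorted(r[0] for r in select_live(conn, "instances"))
live_b = select_live(conn, "instances", "hostname = ?", ("b",))
```

Even with a single choke point like this, the database still stores and scans the soft-deleted rows, which is the performance cost a history table avoids.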
