[openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

Jay Pipes jaypipes at gmail.com
Mon Mar 10 21:58:53 UTC 2014


On Tue, 2014-03-11 at 01:29 +0400, Boris Pavlovic wrote:
> <snip>
> In a nutshell, the most important issues are:
> 1) Extra complexity to each select query & extra column in each index
> 2) Extra column in each Unique Constraint (worse performance)
> 3) 2 extra columns in each table: (deleted, deleted_at)
> 4) Common garbage collector is required

Nice summary of the problems related to soft deletion, Boris.
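
To make issues 1 and 2 concrete, here is a minimal sketch of the soft-delete
pattern using Python's sqlite3 module. The table layout, the helper names and
the (project_id, name, deleted) constraint are assumptions for illustration
only, not Nova's actual schema or code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE instances (
        id INTEGER PRIMARY KEY,
        project_id TEXT NOT NULL,
        name TEXT NOT NULL,
        deleted INTEGER NOT NULL DEFAULT 0,   -- extra column no. 1
        deleted_at TEXT,                      -- extra column no. 2
        UNIQUE (project_id, name, deleted)    -- "deleted" widens the constraint
    )
""")


def create(project_id, name):
    """Insert a live row and return its id."""
    cur = conn.execute(
        "INSERT INTO instances (project_id, name) VALUES (?, ?)",
        (project_id, name),
    )
    return cur.lastrowid


def soft_delete(instance_id):
    """Mark a row deleted; deleted = id keeps deleted rows mutually unique."""
    conn.execute(
        "UPDATE instances SET deleted = id, deleted_at = datetime('now') "
        "WHERE id = ? AND deleted = 0",
        (instance_id,),
    )


def list_live(project_id):
    """Every read has to carry the extra 'deleted = 0' filter."""
    rows = conn.execute(
        "SELECT name FROM instances "
        "WHERE project_id = ? AND deleted = 0 ORDER BY id",
        (project_id,),
    )
    return [row[0] for row in rows]


first = create("p1", "web")
soft_delete(first)
second = create("p1", "web")  # allowed: the old row now has deleted = id
live = list_live("p1")
total_rows = conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
```

The soft-deleted row stays in the table until something collects it, which is
issue 4.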

> To resolve all these issues we should just remove soft deletion.
> 
> One approach that I see is removing the "deleted" column from every
> table step by step, probably with some code refactoring.  Actually we
> have 3 different cases:
> 
> 1) We don't use soft deleted records: 
> 1.1) Do .delete() instead of .soft_delete()
> 1.2) Change query to avoid adding extra "deleted == 0" to each query
> 1.3) Drop "deleted" and "deleted_at" columns
> 
> 2) We use soft deleted records for internal stuff "e.g. periodic
> tasks"
> 2.1) Refactor code somehow: E.g. store all required data by periodic
> task in some special table that has: (id, type, json_data) columns
> 2.2) On delete add record to this table 
> 2.3-5) similar to 1.1, 1.2, 1.3
> 
> 3) We use soft deleted records in API 
> 3.1) Deprecate the API call if possible
> 3.2) Make proxy call to ceilometer from API 
> 3.3) On .delete() store info about records in (ceilometer, or
> somewhere else) 
> 3.4-6) similar to 1.1, 1.2, 1.3
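
For illustration, steps 2.1 and 2.2 of the quoted plan might look roughly like
the sketch below. Only the (id, type, json_data) column list comes from the
proposal; the "volumes" table, the "deleted_records" table name and the
delete_with_record() helper are hypothetical:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us turn a row into a dict by column name
conn.executescript("""
    CREATE TABLE volumes (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        size_gb INTEGER NOT NULL
    );
    CREATE TABLE deleted_records (
        id INTEGER PRIMARY KEY,
        type TEXT NOT NULL,
        json_data TEXT NOT NULL
    );
""")


def delete_with_record(table, row_id):
    """Copy the row into deleted_records, then hard delete it.

    The table name is interpolated into SQL, so it must come from trusted
    code, never from user input.
    """
    with conn:  # one transaction: commit on success, roll back on error
        row = conn.execute(
            f"SELECT * FROM {table} WHERE id = ?", (row_id,)
        ).fetchone()
        if row is None:
            return False
        conn.execute(
            "INSERT INTO deleted_records (type, json_data) VALUES (?, ?)",
            (table, json.dumps(dict(row))),
        )
        conn.execute(f"DELETE FROM {table} WHERE id = ?", (row_id,))
    return True


conn.execute("INSERT INTO volumes (name, size_gb) VALUES ('data', 10)")
deleted = delete_with_record("volumes", 1)
```

A periodic task would then read deleted_records instead of scanning the main
table for soft-deleted rows.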

I would actually prefer this "solution", at least for server instances:

1. Remove any contractual obligation in the API to allow servers with
the same "name" to exist, as long as only one of those servers is not
deleted. As I've mentioned before, I think it is exceedingly silly to
slow down the operation of Nova just to allow a user to create a server,
delete it, and immediately create a server with the same name.

2. Make the unique constraint for the server name be on (project_id,
name) and be done with it.

3. Remove deleted and deleted_at from the instances table.

4. Don't allow any delete() operation on the nova.objects.instance
object at all.

5. Hard delete records from the instances table on a periodic basis
using an external archiver that either just deletes the records in
instances that are in ERROR or TERMINATED vm_state (as is possible if
ceilometer is providing your bookkeeping) or moves those records into an
archival table (as would be necessary if you are not running Ceilometer
and need some history of these things).
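
A rough sketch of what such an external archiver could do, again with sqlite3.
The column set, the "instances_archive" table and the archive_instances()
helper are assumptions made for the example; only the (project_id, name)
constraint and the ERROR / TERMINATED states come from the list above:

```python
import sqlite3

ARCHIVABLE_STATES = ("ERROR", "TERMINATED")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (
        id INTEGER PRIMARY KEY,
        project_id TEXT NOT NULL,
        name TEXT NOT NULL,
        vm_state TEXT NOT NULL,
        UNIQUE (project_id, name)   -- no "deleted" column in the constraint
    );
    CREATE TABLE instances_archive (
        id INTEGER PRIMARY KEY,
        project_id TEXT NOT NULL,
        name TEXT NOT NULL,
        vm_state TEXT NOT NULL
    );
""")


def archive_instances(keep_history=True):
    """Hard delete finished instances, optionally copying them aside first.

    keep_history=False models the case where something else (for example
    Ceilometer) already does the bookkeeping.
    """
    placeholders = ", ".join("?" for _ in ARCHIVABLE_STATES)
    where = f"vm_state IN ({placeholders})"
    with conn:  # copy and delete happen in one transaction
        if keep_history:
            conn.execute(
                "INSERT INTO instances_archive "
                "SELECT id, project_id, name, vm_state FROM instances "
                f"WHERE {where}",
                ARCHIVABLE_STATES,
            )
        cur = conn.execute(
            f"DELETE FROM instances WHERE {where}", ARCHIVABLE_STATES
        )
    return cur.rowcount


conn.executemany(
    "INSERT INTO instances (project_id, name, vm_state) VALUES (?, ?, ?)",
    [("p1", "web", "ACTIVE"), ("p1", "db", "TERMINATED"), ("p2", "web", "ERROR")],
)
removed = archive_instances()
```

Note that in this sketch a name stays taken until the archiver has run, which
is exactly the trade-off from point 1.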

For other objects in the system, I think your solution #1 would work
fine.

Best,
-jay




