[openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
geekinutah at gmail.com
Tue Mar 11 17:24:28 UTC 2014
Undeleting things is an important use case in my opinion. We do this in our
environment on a regular basis. In that light I'm not sure that it would be
appropriate just to log the deletion and git rid of the row. I would like
to see it go to an archival table where it is easily restored.
On Mon, Mar 10, 2014 at 3:44 PM, Joshua Harlow <harlowja at yahoo-inc.com>wrote:
> Sounds like a good idea to me.
> I've never understood why we treat the DB as a LOG (keeping deleted == 0
> records around) when we should just use a LOG (or similar system) to begin
> with instead.
> Does anyone use the feature of switching deleted == 1 back to deleted =
> 0? Has this worked out for u?
> Seems like some of the feedback on
> https://etherpad.openstack.org/p/operators-feedback-mar14 also suggests
> that this has been a operational pain-point for folks (Tool to delete
> things properly suggestions and such...).
> From: Boris Pavlovic <bpavlovic at mirantis.com>
> Reply-To: "OpenStack Development Mailing List (not for usage questions)" <
> openstack-dev at lists.openstack.org>
> Date: Monday, March 10, 2014 at 1:29 PM
> To: OpenStack Development Mailing List <openstack-dev at lists.openstack.org>,
> Victor Sergeyev <vsergeyev at mirantis.com>
> Subject: [openstack-dev] [all][db][performance] Proposal: Get rid of soft
> deletion (step by step)
> Hi stackers,
> (It's proposal for Juno.)
> Soft deletion means that records from DB are not actually deleted, they
> are just marked as a "deleted". To mark record as a "deleted" we put in
> special table's column "deleted" record's ID value.
> Issue 1: Indexes & Queries
> We have to add in every query "AND deleted == 0" to get non-deleted
> It produce performance issue, cause we should add it in any index one
> "extra" column.
> As well it produce extra complexity in db migrations and building queries.
> Issue 2: Unique constraints
> Why we store ID in deleted and not True/False?
> The reason is that we would like to be able to create real DB unique
> constraints and avoid race conditions on "insert" operation.
> Sample: we Have table (id, name, password, deleted) we would like to put
> in column "name" only unique value.
> Approach without UC: if count(`select .... where name = name`) == 0:
> (race cause we are able to add new record between )
> Approach with UC: try: insert(...) except Duplicate: ...
> So to add UC we have to add them on (name, deleted). (to be able to make
> insert/delete/insert with same name)
> As well it produce performance issues, because we have to use Complex
> unique constraints on 2 or more columns. + extra code & complexity in db
> Issue 3: Garbage collector
> It is really hard to make garbage collector that will have good
> performance and be enough common to work in any case for any project.
> Without garbage collector DevOps have to cleanup records by hand, (risk to
> break something). If they don't cleanup DB they will get very soon
> performance issue.
> To put in a nutshell most important issues:
> 1) Extra complexity to each select query & extra column in each index
> 2) Extra column in each Unique Constraint (worse performance)
> 3) 2 Extra column in each table: (deleted, deleted_at)
> 4) Common garbage collector is required
> To resolve all these issues we should just remove soft deletion.
> One of approaches that I see is in step by step removing "deleted"
> column from every table with probably code refactoring. Actually we have 3
> different cases:
> 1) We don't use soft deleted records:
> 1.1) Do .delete() instead of .soft_delete()
> 1.2) Change query to avoid adding extra "deleted == 0" to each query
> 1.3) Drop "deleted" and "deleted_at" columns
> 2) We use soft deleted records for internal stuff "e.g. periodic tasks"
> 2.1) Refactor code somehow: E.g. store all required data by periodic task
> in some special table that has: (id, type, json_data) columns
> 2.2) On delete add record to this table
> 2.3-5) similar to 1.1, 1.2, 13
> 3) We use soft deleted records in API
> 3.1) Deprecated API call if it is possible
> 3.2) Make proxy call to ceilometer from API
> 3.3) On .delete() store info about records in (ceilometer, or somewhere
> 3.4-6) similar to 1.1, 1.2, 1.3
> This is not ready RoadMap, just base thoughts to start the constructive
> discussion in the mailing list, so %stacker% your opinion is very
> Best regards,
> Boris Pavlovic
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the OpenStack-dev