<div dir="ltr">Undeleting things is an important use case in my opinion. We do this in our environment on a regular basis. In that light I'm not sure that it would be appropriate just to log the deletion and git rid of the row. I would like to see it go to an archival table where it is easily restored.<div>
<br></div><div>-Mike</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Mar 10, 2014 at 3:44 PM, Joshua Harlow <span dir="ltr"><<a href="mailto:harlowja@yahoo-inc.com" target="_blank">harlowja@yahoo-inc.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="font-size:14px;font-family:Calibri,sans-serif;word-wrap:break-word">
<div>Sounds like a good idea to me.</div>
<div><br>
</div>
<div>I've never understood why we treat the DB as a LOG (keeping deleted == 0 records around) when we should just use a LOG (or similar system) to begin with instead. </div>
<div><br>
</div>
<div>Does anyone use the feature of switching deleted == 1 back to deleted = 0? Has this worked out for u?</div>
<div><br>
</div>
<div>Seems like some of the feedback on <a href="https://etherpad.openstack.org/p/operators-feedback-mar14" target="_blank">https://etherpad.openstack.org/p/operators-feedback-mar14</a> also suggests that this has been a operational pain-point for folks (Tool to delete things
properly suggestions and such…).</div>
<div><br>
</div>
<span>
<div style="border-right:medium none;padding-right:0in;padding-left:0in;padding-top:3pt;text-align:left;font-size:11pt;border-bottom:medium none;font-family:Calibri;border-top:#b5c4df 1pt solid;padding-bottom:0in;border-left:medium none">
<span style="font-weight:bold">From: </span>Boris Pavlovic <<a href="mailto:bpavlovic@mirantis.com" target="_blank">bpavlovic@mirantis.com</a>><br>
<span style="font-weight:bold">Reply-To: </span>"OpenStack Development Mailing List (not for usage questions)" <<a href="mailto:openstack-dev@lists.openstack.org" target="_blank">openstack-dev@lists.openstack.org</a>><br>
<span style="font-weight:bold">Date: </span>Monday, March 10, 2014 at 1:29 PM<br>
<span style="font-weight:bold">To: </span>OpenStack Development Mailing List <<a href="mailto:openstack-dev@lists.openstack.org" target="_blank">openstack-dev@lists.openstack.org</a>>, Victor Sergeyev <<a href="mailto:vsergeyev@mirantis.com" target="_blank">vsergeyev@mirantis.com</a>><br>
<span style="font-weight:bold">Subject: </span>[openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)<br>
</div><div><div class="h5">
<div><br>
</div>
<blockquote style="BORDER-LEFT:#b5c4df 5 solid;PADDING:0 0 0 5;MARGIN:0 0 0 5">
<div>
<div>
<div dir="ltr">Hi stackers,
<div><br>
</div>
<div>(It's proposal for Juno.)</div>
<div><br>
</div>
<div>Intro:</div>
<div><br>
</div>
<div>Soft deletion means that records from DB are not actually deleted, they are just marked as a "deleted". To mark record as a "deleted" we put in special table's column "deleted" record's ID value. </div>
<div><br>
</div>
<div>Issue 1: Indexes & Queries</div>
<div>We have to add in every query "AND deleted == 0" to get non-deleted records.</div>
<div>It produce performance issue, cause we should add it in any index one "extra" column. </div>
<div>As well it produce extra complexity in db migrations and building queries. </div>
<div><br>
</div>
<div>Issue 2: Unique constraints </div>
<div>Why we store ID in deleted and not True/False? <br>
</div>
<div>The reason is that we would like to be able to create real DB unique constraints and avoid race conditions on "insert" operation. </div>
<div><br>
</div>
<div>Sample: we Have table (id, name, password, deleted) we would like to put in column "name" only unique value. </div>
<div><br>
</div>
<div>Approach without UC: if count(`select .... where name = name`) == 0: insert(...)</div>
<div>(race cause we are able to add new record between )</div>
<div><br>
</div>
<div>Approach with UC: try: insert(...) except Duplicate: ...</div>
<div><br>
</div>
<div>So to add UC we have to add them on (name, deleted). (to be able to make insert/delete/insert with same name)</div>
<div><br>
</div>
<div>As well it produce performance issues, because we have to use Complex unique constraints on 2 or more columns. + extra code & complexity in db migrations. </div>
<div><br>
</div>
<div>Issue 3: Garbage collector</div>
<div><br>
</div>
<div>It is really hard to make garbage collector that will have good performance and be enough common to work in any case for any project. </div>
<div>Without garbage collector DevOps have to cleanup records by hand, (risk to break something). If they don't cleanup DB they will get very soon performance issue. </div>
<div><br>
</div>
<div>To put in a nutshell most important issues:</div>
<div>1) Extra complexity to each select query & extra column in each index</div>
<div>2) Extra column in each Unique Constraint (worse performance)</div>
<div>3) 2 Extra column in each table: (deleted, deleted_at)</div>
<div>4) Common garbage collector is required</div>
<div><br>
</div>
<div><br>
</div>
<div>To resolve all these issues we should just remove soft deletion.</div>
<div><br>
</div>
<div>One of approaches that I see is in step by step removing "deleted" column from every table with probably code refactoring. Actually we have 3 different cases:</div>
<div><br>
</div>
<div>1) We don't use soft deleted records: </div>
<div>1.1) Do .delete() instead of .soft_delete()</div>
<div>1.2) Change query to avoid adding extra "deleted == 0" to each query</div>
<div>1.3) Drop "deleted" and "deleted_at" columns </div>
<div><br>
</div>
<div>2) We use soft deleted records for internal stuff "e.g. periodic tasks"</div>
<div>2.1) Refactor code somehow: E.g. store all required data by periodic task in some special table that has: (id, type, json_data) columns</div>
<div>2.2) On delete add record to this table </div>
<div>2.3-5) similar to 1.1, 1.2, 13</div>
<div><br>
</div>
<div>3) We use soft deleted records in API </div>
<div>3.1) Deprecated API call if it is possible </div>
<div>3.2) Make proxy call to ceilometer from API </div>
<div>3.3) On .delete() store info about records in (ceilometer, or somewhere else) </div>
<div>3.4-6) similar to 1.1, 1.2, 1.3</div>
<div> </div>
<div>This is not ready RoadMap, just base thoughts to start the constructive discussion in the mailing list, so %stacker% your opinion is very important! </div>
<div><br>
</div>
<div><br>
</div>
<div>Best regards,</div>
<div>Boris Pavlovic </div>
</div>
</div>
</div>
</blockquote>
</div></div></span>
</div>
<br>_______________________________________________<br>
OpenStack-dev mailing list<br>
<a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
<br></blockquote></div><br></div>