[openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
Joshua Harlow
harlowja at yahoo-inc.com
Fri Mar 14 18:39:14 UTC 2014
Off topic, but I'd like to see a Word doc written out with the history of the cloud; that'd be pretty sweet.
Especially if it's something like Google Docs, where you can watch the changes happen in real time.
+2
From: Jay Pipes <jaypipes at gmail.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Friday, March 14, 2014 at 7:55 AM
To: "openstack-dev at lists.openstack.org" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)
On Fri, 2014-03-14 at 08:37 +0100, Radomir Dopieralski wrote:
Hello,
I also think that this thread is going in the wrong direction, but I
don't think the direction Boris wants is the correct one either. Frankly
I'm a little surprised that nobody mentioned another advantage that soft
delete gives us, the one that I think it was actually used for originally.
You see, soft delete is an optimization. It's there to make the system
work faster as a whole, have less code and be simpler to maintain and debug.
How does it do it, when, as clearly shown in the first post in this
thread, it makes the queries slower, requires additional indices in the
database and more logic in the queries?
I feel it isn't an optimization if it:
* Slows down the code base
* Makes the code harder to read and understand
* Deliberately obscures the actions of removing and restoring resources
* Encourages the idea that everything in the system is "undoable", as if
the cloud were a Word doc.
The answer is, by doing more
with those queries, by making you write less code, execute fewer queries
to the databases and avoid duplicating the same data in multiple places.
Fewer queries do not always make faster code, nor do they lead to
inherently race-free code.
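To make the cost under discussion concrete, here is a minimal sketch using Python's standard sqlite3 module. The `instances` table, its `deleted` flag, and the helper names are assumptions made for illustration, not Nova's actual schema or API. It shows that with soft delete every read has to carry an extra filter, and that forgetting the filter silently returns deleted rows, while a hard delete keeps the read path plain.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical instances table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " deleted INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO instances (name) VALUES (?)", [("vm-a",), ("vm-b",)]
    )
    return conn


def soft_delete(conn, instance_id):
    # Soft delete: the row stays in the table and is only flagged.
    conn.execute("UPDATE instances SET deleted = 1 WHERE id = ?", (instance_id,))


def hard_delete(conn, instance_id):
    # Hard delete: the row is actually removed.
    conn.execute("DELETE FROM instances WHERE id = ?", (instance_id,))


def list_names(conn, filter_deleted):
    """List instance names; filter_deleted adds the soft-delete predicate."""
    sql = "SELECT name FROM instances"
    if filter_deleted:
        sql += " WHERE deleted = 0"
    sql += " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
```

With soft delete, the table keeps growing and each query needs `filter_deleted=True` to be correct; with hard delete, the unfiltered query is already correct.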
OpenStack is a big, distributed system of multiple databases that
sometimes rely on each other and cross-reference their records. It's not
uncommon to have some long-running operation started, that uses some
data, and then, in the middle of its execution, have that data deleted.
With soft delete, that's not a problem -- the operation can continue
safely and proceed as scheduled, with the data it was started with in
the first place -- it still has access to the deleted records as if
nothing happened.
I believe a better solution would be to use Boris' solution and
implement safeguards around the delete operation. For instance, not
being able to delete an instance that has tasks still running against
it. Either that, or implement true task abortion logic that can notify
distributed components about the need to stop a running task because
either the user wants to delete a resource or simply cancel the
operation they began.
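A rough sketch of the first safeguard, assuming a hypothetical in-memory `InstanceStore` that tracks running tasks per instance (none of these names come from OpenStack): the delete is refused while any task is still running, and the check and the removal happen under one lock so a task cannot start in between.

```python
import threading


class InstanceBusy(Exception):
    """Raised when a delete is attempted while tasks are still running."""


class InstanceStore:
    """Hypothetical in-memory store: instance id -> set of running task ids."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = {}

    def create(self, instance_id):
        with self._lock:
            self._running[instance_id] = set()

    def start_task(self, instance_id, task_id):
        with self._lock:
            # A KeyError here means the instance has already been deleted.
            self._running[instance_id].add(task_id)

    def finish_task(self, instance_id, task_id):
        with self._lock:
            self._running[instance_id].discard(task_id)

    def delete(self, instance_id):
        with self._lock:
            tasks = self._running[instance_id]
            if tasks:
                # Explicit refusal instead of hiding the row behind a flag.
                raise InstanceBusy(
                    f"{instance_id} has running tasks: {sorted(tasks)}"
                )
            del self._running[instance_id]
```

Once the running task finishes, the same `delete` call succeeds, and any later `start_task` on that instance fails immediately rather than operating on stale data.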
You simply won't be able to schedule another operation
like that with the same data, because it has been soft-deleted and won't
pass the validation at the beginning (or even won't appear in the UI or
CLI). This solves a lot of race conditions, error handling, additional
checks to make sure the record still exists, etc.
Sorry, I disagree here. Components that rely on the soft-delete behavior
to get the resource data from the database should instead respond to a
raised NotFound by aborting their running task.
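A minimal sketch of that behavior, assuming a plain dict stands in for the database and a locally defined `NotFound` exception (both are illustrative, not the real OpenStack classes): the task re-reads the record before each step and aborts explicitly once the record has been hard-deleted.

```python
class NotFound(Exception):
    """Raised when a record no longer exists (illustrative stand-in)."""


def get_instance(db, instance_id):
    """Fetch a record from the dict-backed 'database' or raise NotFound."""
    try:
        return db[instance_id]
    except KeyError:
        raise NotFound(instance_id) from None


def run_task(db, instance_id, steps):
    """Run steps in order, aborting as soon as the instance is gone.

    Returns a (status, completed_step_names) tuple.
    """
    completed = []
    for step in steps:
        try:
            instance = get_instance(db, instance_id)
        except NotFound:
            # Explicit code path: the resource was deleted mid-operation.
            return ("aborted", completed)
        step(db, instance)
        completed.append(step.__name__)
    return ("done", completed)
```

The abort branch is where cleanup or a notification to other components would go in a real system; the sketch only records which steps ran.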
Without soft delete, you need to write custom code every time to handle
the case of a record being deleted mid-operation, including all the
possible combinations of which record and when.
Not custom code. Explicit code paths for explicit actions.
Or you need to copy all
the relevant data in advance over to whatever is executing that
operation.
This is already happening.
This cannot be abstracted away entirely (although tools like
TaskFlow help), as this is specific to the case you are handling. And
it's not easy to find all the places where you can have a race condition
like that -- especially when you are modifying existing code that has
been relying on soft delete before. You can have bugs undetected for
years, that only appear in production, on very large deployments, and
are impossible to reproduce reliably.
There are more similar cases like that, including cascading deletes and
more advanced stuff, but I think this single case already shows that
the advantages of soft delete outweigh its disadvantages.
I respectfully disagree :) I think the benefits of explicit code paths
and increased performance of the database outweigh the costs of changing
existing code.
Best,
-jay
On 13/03/14 19:52, Boris Pavlovic wrote:
> Hi all,
>
>
> I would like to fix the direction of this thread, because it is going in the wrong
> direction.
>
> Let's assume:
> 1) Yes, restoring already deleted resources could be useful.
> 2) The current approach with soft deletion is broken by design and we should
> get rid of it.
>
> More about why I think that it is broken:
> 1) When you are restoring some resource you should restore N records
> from N tables (e.g. VM)
> 2) Restoring sometimes means not only restoring DB records.
> 3) Not all resources should be restorable (e.g. why would I need to restore
> a fixed_ip or key-pairs?)
>
>
> So what we should think about is:
> 1) How to implement restore functionality in a common way (e.g. a framework
> that will be in oslo)
> 2) Splitting the work of getting rid of soft deletion into steps (which I
> already mentioned):
> a) remove soft deletion from places where we are not using it
> b) replace internal code where we are using soft deletion with that framework
> c) replace API stuff using ceilometer (for logs) or this framework (for
> restorable stuff)
>
>
> To put it in a nutshell: restoring deleted resources / delayed deletion !=
> soft deletion.
>
>
> Best regards,
> Boris Pavlovic
>
>
>
> On Thu, Mar 13, 2014 at 9:21 PM, Mike Wilson <geekinutah at gmail.com> wrote:
>
> For some guests we use the LVM imagebackend and there are times when
> the guest is deleted on accident. Humans, being what they are, don't
> back up their files and don't take care of important data, so it is
> not uncommon to use lvrestore and "undelete" an instance so that
> people can get their data. Of course, this is not always possible if
> the data has been subsequently overwritten. But it is common enough
> that I imagine most of our operators are familiar with how to do it.
> So I guess my saying that we do it on a regular basis is not quite
> accurate. Probably would be better to say that it is not uncommon to
> do this, but definitely not a daily task or something of that ilk.
>
> I have personally "undeleted" an instance a few times after
> accidental deletion also. I can't remember the specifics, but I do
> remember doing it :-).
>
> -Mike
>
>
> On Tue, Mar 11, 2014 at 12:46 PM, Johannes Erdfelt
> <johannes at erdfelt.com> wrote:
>
> On Tue, Mar 11, 2014, Mike Wilson <geekinutah at gmail.com> wrote:
> > Undeleting things is an important use case in my opinion. We do this in our
> > environment on a regular basis. In that light I'm not sure that it would be
> > appropriate just to log the deletion and get rid of the row. I would like
> > to see it go to an archival table where it is easily restored.
>
> I'm curious, what are you undeleting and why?
>
> JE
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev