[openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

Mike Wilson geekinutah at gmail.com
Fri Mar 14 18:25:15 UTC 2014


+1 to what Jay says here. This hidden behavior mostly just causes problems
and enables hacky, hidden ways to restore things.

-Mike


On Fri, Mar 14, 2014 at 9:55 AM, Jay Pipes <jaypipes at gmail.com> wrote:

> On Fri, 2014-03-14 at 08:37 +0100, Radomir Dopieralski wrote:
> > Hello,
> >
> > I also think that this thread is going in the wrong direction, but I
> > don't think the direction Boris wants is the correct one either. Frankly
> > I'm a little surprised that nobody mentioned another advantage that soft
> > delete gives us, the one that I think it was actually used for
> > originally.
> >
> > You see, soft delete is an optimization. It's there to make the system
> > work faster as a whole, have less code and be simpler to maintain and
> > debug.
> >
> > How does it do it, when, as clearly shown in the first post in this
> > thread, it makes the queries slower, requires additional indices in the
> > database and more logic in the queries?
>
> I feel it isn't an optimization if:
>
> * It slows down the code base
> * Makes the code harder to read and understand
> * Deliberately obscures the actions of removing and restoring resources
> * Encourages the idea that everything in the system is "undoable", like
> the cloud is a Word doc.
>
> >  The answer is, by doing more
> > with those queries, by making you write less code, execute fewer queries
> > to the databases and avoid duplicating the same data in multiple places.
>
> Fewer queries do not always make faster code, nor do they lead to
> inherently race-free code.
>
> > OpenStack is a big, distributed system of multiple databases that
> > sometimes rely on each other and cross-reference their records. It's not
> > uncommon to have some long-running operation started, that uses some
> > data, and then, in the middle of its execution, have that data deleted.
> > With soft delete, that's not a problem -- the operation can continue
> > safely and proceed as scheduled, with the data it was started with in
> > the first place -- it still has access to the deleted records as if
> > nothing happened.
>
> I believe a better solution would be to use Boris' solution and
> implement safeguards around the delete operation. For instance, not
> being able to delete an instance that has tasks still running against
> it. Either that, or implement true task abortion logic that can notify
> distributed components about the need to stop a running task because
> either the user wants to delete a resource or simply cancel the
> operation they began.
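
A minimal sketch of the first safeguard Jay describes, refusing to delete an instance while tasks are still running against it. The `InstanceStore` class, its method names and the `InstanceBusy` exception are hypothetical names made up for this illustration; they are not Nova code.

```python
class InstanceBusy(Exception):
    """Raised when a delete is refused because tasks are still running."""


class InstanceStore:
    """Toy in-memory store; stands in for the real database layer."""

    def __init__(self):
        self._instances = {}  # instance_id -> instance data
        self._running = {}    # instance_id -> set of running task names

    def add(self, instance_id, data):
        self._instances[instance_id] = data
        self._running[instance_id] = set()

    def exists(self, instance_id):
        return instance_id in self._instances

    def start_task(self, instance_id, task):
        self._running[instance_id].add(task)

    def finish_task(self, instance_id, task):
        self._running[instance_id].discard(task)

    def delete(self, instance_id):
        # The safeguard: a hard delete is only allowed once nothing is
        # running against the instance, so no task loses its data midway.
        running = self._running[instance_id]
        if running:
            raise InstanceBusy(
                "instance %s still has running tasks: %s"
                % (instance_id, ", ".join(sorted(running)))
            )
        del self._instances[instance_id]
        del self._running[instance_id]
```

With a guard like this the row can be removed for real, because the delete is rejected up front instead of being hidden behind a flag.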
>
> >  You simply won't be able to schedule another operation
> > like that with the same data, because it has been soft-deleted and won't
> > pass the validation at the beginning (or even won't appear in the UI or
> > CLI). This solves a lot of race conditions, error handling, additional
> > checks to make sure the record still exists, etc.
>
> Sorry, I disagree here. Components that rely on the soft-delete behavior
> to get the resource data from the database should instead respond to a
> NotFound that gets raised by aborting their running task.
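
A rough sketch of that explicit code path: the task re-fetches the resource before each step and aborts when the lookup raises NotFound. The `NotFound` class, `get_instance` and `run_task` here are hypothetical helpers written for this example, and the plain dict stands in for the database.

```python
class NotFound(Exception):
    """Raised when the requested resource no longer exists."""


def get_instance(store, instance_id):
    # Hard-deleted rows are simply gone, so the lookup fails loudly.
    try:
        return store[instance_id]
    except KeyError:
        raise NotFound(instance_id) from None


def run_task(store, instance_id, steps):
    """Run steps in order; abort as soon as the instance disappears."""
    completed = []
    for step in steps:
        try:
            instance = get_instance(store, instance_id)
        except NotFound:
            # Explicit abort path instead of silently working on a
            # soft-deleted row.
            return ("aborted", completed)
        step(instance)
        completed.append(step.__name__)
    return ("done", completed)
```

The caller gets an explicit "aborted" result plus the list of steps that did run, which is the information needed for any cleanup.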
>
> > Without soft delete, you need to write custom code every time to handle
> > the case of a record being deleted mid-operation, including all the
> > possible combinations of which record and when.
>
> Not custom code. Explicit code paths for explicit actions.
>
> >  Or you need to copy all
> > the relevant data in advance over to whatever is executing that
> > operation.
>
> This is already happening.
>
> > This cannot be abstracted away entirely (although tools like
> > TaskFlow help), as this is specific to the case you are handling. And
> > it's not easy to find all the places where you can have a race condition
> > like that -- especially when you are modifying existing code that has
> > been relying on soft delete before. You can have bugs undetected for
> > years, that only appear in production, on very large deployments, and
> > are impossible to reproduce reliably.
> >
> > There are more similar cases like that, including cascading deletes and
> > more advanced stuff, but I think this single case already shows that
> > the advantages of soft delete outweigh its disadvantages.
>
> I respectfully disagree :) I think the benefits of explicit code paths
> and increased performance of the database outweigh the costs of changing
> existing code.
>
> Best,
> -jay
>
> > On 13/03/14 19:52, Boris Pavlovic wrote:
> > > Hi all,
> > >
> > >
> > > I would like to fix the direction of this thread, because it is going
> > > in the wrong direction.
> > >
> > > Let's assume:
> > > 1) Yes, restoring already deleted resources could be useful.
> > > 2) The current approach with soft deletion is broken by design and we
> > > should get rid of it.
> > >
> > > More about why I think that it is broken:
> > > 1) When you are restoring some resource you should restore N records
> > > from N tables (e.g. VM)
> > > 2) Restoring sometimes means not only restoring DB records.
> > > 3) Not all resources should be restorable (e.g. why would I need to
> > > restore a fixed_ip? or key-pairs?)
> > >
> > >
> > > So what we should think about is:
> > > 1) How to implement restoring functionality in a common way (e.g. a
> > > framework that will be in oslo)
> > > 2) Splitting the work of getting rid of soft deletion into steps (which
> > > I already mentioned):
> > > a) remove soft deletion from places where we are not using it
> > > b) replace internal code where we are using soft deletion with that
> > > framework
> > > c) replace API stuff using ceilometer (for logs) or this framework (for
> > > restorable stuff)
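
One possible reading of "restorable without soft deletion" is the archival table suggested earlier in the thread: the live table is hard-deleted and the row is copied aside in the same transaction. The sketch below uses Python's sqlite3 with a made-up two-column schema; the table names and helper functions are assumptions for illustration, not the proposed oslo framework. It covers a single table only, while restoring a real VM means N tables plus non-database state, as noted above.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a live table and an archive table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute(
        "CREATE TABLE instances_archive (id INTEGER PRIMARY KEY, name TEXT)"
    )
    return db


def delete_with_archive(db, instance_id):
    # Copy the row aside and hard-delete it in one transaction, so the
    # live table never needs a "deleted" column or an extra filter.
    with db:
        db.execute(
            "INSERT INTO instances_archive "
            "SELECT id, name FROM instances WHERE id = ?",
            (instance_id,),
        )
        db.execute("DELETE FROM instances WHERE id = ?", (instance_id,))


def restore(db, instance_id):
    # Restoring is an explicit action: move the row back.
    with db:
        db.execute(
            "INSERT INTO instances "
            "SELECT id, name FROM instances_archive WHERE id = ?",
            (instance_id,),
        )
        db.execute(
            "DELETE FROM instances_archive WHERE id = ?", (instance_id,)
        )


def live_names(db):
    """Names of live instances; note there is no deleted-flag filter."""
    rows = db.execute("SELECT name FROM instances ORDER BY id")
    return [row[0] for row in rows]
```

Queries against the live table stay simple, and which resources get an archive table at all becomes a per-resource decision.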
> > >
> > >
> > > To put it in a nutshell: Restoring Deleted resources / Delayed Deletion
> > > != Soft deletion.
> > >
> > >
> > > Best regards,
> > > Boris Pavlovic
> > >
> > >
> > >
> > > On Thu, Mar 13, 2014 at 9:21 PM, Mike Wilson <geekinutah at gmail.com
> > > <mailto:geekinutah at gmail.com>> wrote:
> > >
> > >     For some guests we use the LVM imagebackend and there are times when
> > >     the guest is deleted by accident. Humans, being what they are, don't
> > >     back up their files and don't take care of important data, so it is
> > >     not uncommon to use lvrestore and "undelete" an instance so that
> > >     people can get their data. Of course, this is not always possible if
> > >     the data has been subsequently overwritten. But it is common enough
> > >     that I imagine most of our operators are familiar with how to do it.
> > >     So I guess my saying that we do it on a regular basis is not quite
> > >     accurate. Probably would be better to say that it is not uncommon to
> > >     do this, but definitely not a daily task or something of that ilk.
> > >
> > >     I have personally "undeleted" an instance a few times after
> > >     accidental deletion also. I can't remember the specifics, but I do
> > >     remember doing it :-).
> > >
> > >     -Mike
> > >
> > >
> > >     On Tue, Mar 11, 2014 at 12:46 PM, Johannes Erdfelt
> > >     <johannes at erdfelt.com <mailto:johannes at erdfelt.com>> wrote:
> > >
> > >         On Tue, Mar 11, 2014, Mike Wilson <geekinutah at gmail.com
> > >         <mailto:geekinutah at gmail.com>> wrote:
> > >         > Undeleting things is an important use case in my opinion. We
> > >         > do this in our environment on a regular basis. In that light
> > >         > I'm not sure that it would be appropriate just to log the
> > >         > deletion and get rid of the row. I would like to see it go to
> > >         > an archival table where it is easily restored.
> > >
> > >         I'm curious, what are you undeleting and why?
> > >
> > >         JE
> > >
> > >
> > >         _______________________________________________
> > >         OpenStack-dev mailing list
> > >         OpenStack-dev at lists.openstack.org
> > >         <mailto:OpenStack-dev at lists.openstack.org>
> > >         http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev