[openstack-dev] [all][db][performance] Proposal: Get rid of soft deletion (step by step)

Alexei Kornienko alexei.kornienko at gmail.com
Fri Mar 14 10:08:13 UTC 2014


On 03/14/2014 09:37 AM, Radomir Dopieralski wrote:
> Hello,
>
> I also think that this thread is going in the wrong direction, but I
> don't think the direction Boris wants is the correct one either. Frankly
> I'm a little surprised that nobody mentioned another advantage that soft
> delete gives us, the one that I think it was actually used for originally.
>
> You see, soft delete is an optimization. It's there to make the system
> work faster as a whole, have less code and be simpler to maintain and debug.
>
> How does it do it, when, as clearly shown in the first post in this
> thread, it makes the queries slower, requires additional indices in the
> database and more logic in the queries? The answer is, by doing more
> with those queries, by making you write less code, execute fewer queries
> to the databases and avoid duplicating the same data in multiple places.
>
> OpenStack is a big, distributed system of multiple databases that
> sometimes rely on each other and cross-reference their records. It's not
> uncommon to have some long-running operation started, that uses some
> data, and then, in the middle of its execution, have that data deleted.
> With soft delete, that's not a problem -- the operation can continue
> safely and proceed as scheduled, with the data it was started with in
> the first place -- it still has access to the deleted records as if
> nothing happened. You simply won't be able to schedule another operation
> like that with the same data, because it has been soft-deleted and won't
> pass the validation at the beginning (or even won't appear in the UI or
> CLI). This solves a lot of race conditions, error handling, additional
> checks to make sure the record still exists, etc.
1) SQL operations run inside transactions, so records deleted within a
transaction stay visible to other clients until that transaction commits.
2) If code inside the same transaction tries to use a record that has
already been deleted, that is definitely a bug in our code and should be
fixed.
I don't think this use case can serve as an argument for keeping
soft-deleted records.
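
To illustrate point 1, here is a minimal sketch using Python's standard
sqlite3 module. SQLite is my choice purely for illustration; this is not
the OpenStack DB layer, and the `instances` table and the
`demo_delete_visibility` function are hypothetical names. It assumes
SQLite's default journal mode, where a second connection keeps reading
the last committed state while the first one holds an uncommitted DELETE.
Other databases and isolation levels may behave differently.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def demo_delete_visibility():
    """Return the row counts a second client sees before and after commit."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        with closing(sqlite3.connect(path)) as writer, \
                closing(sqlite3.connect(path)) as reader:
            writer.execute(
                "CREATE TABLE instances (id INTEGER PRIMARY KEY, name TEXT)")
            writer.execute(
                "INSERT INTO instances (id, name) VALUES (1, 'vm-1')")
            writer.commit()

            # sqlite3 implicitly opens a transaction before the DELETE;
            # nothing is committed at this point.
            writer.execute("DELETE FROM instances WHERE id = 1")
            before = len(
                reader.execute("SELECT id FROM instances").fetchall())

            # Once the writer commits, the hard delete becomes visible.
            writer.commit()
            after = len(
                reader.execute("SELECT id FROM instances").fetchall())
    return before, after
```

Under those assumptions, `demo_delete_visibility()` should return
`(1, 0)`: the other client still sees the row while the delete is
uncommitted, and stops seeing it only after the commit.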
>
> Without soft delete, you need to write custom code every time to handle
> the case of a record being deleted mid-operation, including all the
> possible combinations of which record and when. Or you need to copy all
> the relevant data in advance over to whatever is executing that
> operation. This cannot be abstracted away entirely (although tools like
> TaskFlow help), as this is specific to the case you are handling. And
> it's not easy to find all the places where you can have a race condition
> like that -- especially when you are modifying existing code that has
> been relying on soft delete before. You can have bugs undetected for
> years, that only appear in production, on very large deployments, and
> are impossible to reproduce reliably.
>
> There are more similar cases like that, including cascading deletes and
> more advanced stuff, but I think this single case already shows that
> the advantages of soft delete outweigh its disadvantages.
>
> On 13/03/14 19:52, Boris Pavlovic wrote:
>> Hi all,
>>
>>
>> I would like to fix the direction of this thread, because it is going
>> in the wrong direction.
>>
>> Let's assume:
>> 1) Yes, restoring already deleted resources could be useful.
>> 2) The current approach with soft deletion is broken by design and we
>> should get rid of it.
>>
>> More about why I think that it is broken:
>> 1) When you are restoring some resource you should restore N records
>> from N tables (e.g. VM)
>> 2) Restoring sometimes means not only restoring DB records.
>> 3) Not all resources should be restorable (e.g. why would I need to
>> restore fixed_ip? or key-pairs?)
>>
>>
>> So what we should think about is:
>> 1) How to implement restore functionality in a common way (e.g. a
>> framework that will be in oslo)
>> 2) Split the work of getting rid of soft deletion into steps (that I
>> already mentioned):
>> a) remove soft deletion from places where we are not using it
>> b) replace internal code where we are using soft deletion with that framework
>> c) replace API stuff using ceilometer (for logs) or this framework (for
>> restorable stuff)
>>
>>
>> To put it in a nutshell: Restoring deleted resources / delayed
>> deletion != soft deletion.
>>
>>
>> Best regards,
>> Boris Pavlovic
>>
>>
>>
>> On Thu, Mar 13, 2014 at 9:21 PM, Mike Wilson <geekinutah at gmail.com> wrote:
>>
>>      For some guests we use the LVM imagebackend and there are times when
>>      the guest is deleted on accident. Humans, being what they are, don't
>>      back up their files and don't take care of important data, so it is
>>      not uncommon to use lvrestore and "undelete" an instance so that
>>      people can get their data. Of course, this is not always possible if
>>      the data has been subsequently overwritten. But it is common enough
>>      that I imagine most of our operators are familiar with how to do it.
>>      So I guess my saying that we do it on a regular basis is not quite
>>      accurate. Probably would be better to say that it is not uncommon to
>>      do this, but definitely not a daily task or something of that ilk.
>>
>>      I have personally "undeleted" an instance a few times after
>>      accidental deletion also. I can't remember the specifics, but I do
>>      remember doing it :-).
>>
>>      -Mike
>>
>>
>>      On Tue, Mar 11, 2014 at 12:46 PM, Johannes Erdfelt
>>      <johannes at erdfelt.com> wrote:
>>
>>          On Tue, Mar 11, 2014, Mike Wilson <geekinutah at gmail.com> wrote:
>>          > Undeleting things is an important use case in my opinion. We
>>          > do this in our environment on a regular basis. In that light
>>          > I'm not sure that it would be appropriate just to log the
>>          > deletion and get rid of the row. I would like to see it go to
>>          > an archival table where it is easily restored.
>>
>>          I'm curious, what are you undeleting and why?
>>
>>          JE
>>
>>
>>          _______________________________________________
>>          OpenStack-dev mailing list
>>          OpenStack-dev at lists.openstack.org
>>          <mailto:OpenStack-dev at lists.openstack.org>
>>          http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>>
>>
>>
>>
>>
>>
>
