[openstack-dev] [Nova] Handling soft delete for instance rows in a new cells database

Andrew Laski andrew.laski at rackspace.com
Wed Nov 26 21:32:48 UTC 2014

On 11/26/2014 03:39 PM, Belmiro Moreira wrote:
> Hi,
> my experience is that "soft delete" is important to keep record of 
> deleted instances and its characteristics.
> In fact in my organization we are obliged to keep these records for 
> several months.
> However, it would be nice that after few months we were able to purge 
> the DB with a nova tool.

I think that any solution for this needs to keep the deleted data 
available in some form.  Is it important for you that the deleted data 
be in the same table as non deleted rows, or could they be moved into 
another table?  And would it matter if the format of the row changed 
during a move?

> In the particular case of this cells table my major concern is that 
> having a "delete" field maybe means that top and children databases 
> need to be synchronized. Looking into the current cells design having 
> duplicated information in different databases is one of the main issues.

Agreed.  I think this can be solved by ensuring that instance deletion 
is only about setting the deleted column in the cell instance table.  
The instance mapping being deleted makes no statement about whether or 
not the instance is deleted, though it would be a bug to delete it 
before the instance was deleted.

> Belmiro
> On Wed, Nov 26, 2014 at 4:40 PM, Andrew Laski 
> <andrew.laski at rackspace.com <mailto:andrew.laski at rackspace.com>> wrote:
>     On 11/25/2014 11:54 AM, Solly Ross wrote:
>             I can't comment on other projects, but Nova definitely
>             needs the soft
>             delete in the main nova database. Perhaps not for every
>             table, but
>             there is definitely code in the code base which uses it
>             right now.
>             Search for read_deleted=True if you're curious.
>         Just to save people a bit of time, it's actually
>         `read_deleted='yes'`
>         or `read_deleted="yes"` for many cases.
>         Just to give people a quick overview:
>         A cursory glance (no pun intended) seems to indicate that
>         quite a few of
>         these are reading potentially deleted flavors.  For this case,
>         it makes
>         sense to keep things in one table (as we do).
>         There are also quite a few that seem to be making sure deleted
>         "things"
>         are properly cleaned up.  In this case, 'deleted' acts as a
>         "CLEANUP"
>         state, so it makes just as much sense to keep the deleted rows
>         in a
>         separate table.
>             For this case in particular, the concern is that operators
>             might need
>             to find where an instance was running once it is deleted
>             to be able to
>             diagnose issues reported by users. I think that's a valid
>             use case of
>             this particular data.
>                     This is a new database, so its our big chance to
>                     get this right. So,
>                     ideas welcome...
>                     Some initial proposals:
>                     - we do what we do in the current nova database --
>                     we have a deleted
>                     column, and we set it to true when we delete the
>                     instance.
>                     - we have shadow tables and we move delete rows to
>                     a shadow table.
>                 Both approaches are viable, but as the soft-delete
>                 column is widespread, it
>                 would be thorny for this new app to use some totally
>                 different scheme,
>                 unless the notion is that all schemes should move to
>                 the audit table
>                 approach (which I wouldn’t mind, but it would be a big
>                 job).    FTR, the
>                 audit table approach is usually what I prefer for
>                 greenfield development,
>                 if all that’s needed is forensic capabilities at the
>                 database inspection
>                 level, and not as much active GUI-based “deleted”
>                 flags.   That is, if you
>                 really don’t need to query the history tables very
>                 often except when
>                 debugging an issue offline.  The reason its preferable
>                 is because those
>                 rows are still “deleted” from your main table, and
>                 they don’t get in the
>                 way of querying.   But if you need to refer to these
>                 history rows in
>                 context of the application, that means you need to get
>                 them mapped in such
>                 a way that they behave like the primary rows, which
>                 overall is a more
>                 difficult approach than just using the soft delete column.
>         I think it does really come down here to how you intend to use
>         the soft-delete
>         functionality in Cells.  If you just are using it to debug or
>         audit, then I think
>         the right way to go would be either the audit table
>         (potentially can store more
>         lifecycle data, but could end up taking up more space) or a
>         separate shadow
>         table (takes up less space).
>         If you are going to use the soft delete for application
>         functionality, I would
>         consider differentiating between "deleted" and "we still have
>         things left to
>         clean up", since this seems to be mixing two different
>         requirements into one.
>     The case that spawned this discussion is one where deleted rows
>     are not needed for application functionality.  So I'm going to
>     update the proposed schema there to not include a 'deleted'
>     column. Fortunately there's still some time before the question of
>     how to handle deletes needs to be fully sorted out.
>                 That said, I have a lot of plans to send improvements
>                 down the way of the
>                 existing approach of “soft delete column” into
>                 projects, from the querying
>                 POV, so that criteria to filter out soft delete can be
>                 done in a much more
>                 robust fashion (see
>                 https://bitbucket.org/zzzeek/sqlalchemy/issue/3225/query-heuristic-inspector-event).
>                 But this is still more complex and less performant
>                 than if the rows are
>                 just gone totally, off in a history table somewhere
>                 (again, provided you
>                 really don’t need to look at those history rows in an
>                 application context,
>                 otherwise it gets all complicated again).
>             Interesting. I hadn't seen consistency between the two
>             databases as
>             trumping doing this less horribly, but it sounds like its
>             more of a
>             thing that I thought.
>             Thanks,
>             Michael
>             --
>             Rackspace Australia
>             _______________________________________________
>             OpenStack-dev mailing list
>             OpenStack-dev at lists.openstack.org
>             <mailto:OpenStack-dev at lists.openstack.org>
>             http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>         _______________________________________________
>         OpenStack-dev mailing list
>         OpenStack-dev at lists.openstack.org
>         <mailto:OpenStack-dev at lists.openstack.org>
>         http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>     _______________________________________________
>     OpenStack-dev mailing list
>     OpenStack-dev at lists.openstack.org
>     <mailto:OpenStack-dev at lists.openstack.org>
>     http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20141126/097a2621/attachment.html>

More information about the OpenStack-dev mailing list