Open Stack

Mon Jul 8 13:48:24 UTC 2013

On 07/08/2013 08:15 AM, Nikola Đipanov wrote:

> This is only true if you have one table with no relations that need to
> be considered.
>
> Here is an example of when it gets tricky - Say you have a table T1 and
> a migration that adds a column c1 that relies on some data from table T2
> and T1 has a FK that points to T2. And say for the sake of argument that
> objects that are represented by rows in T1 and T2 have different
> life-times in the system (think instances and devices, groups, quotas,
> networks... this is common in our data model).
>
> In order to properly migrate and assign values to the newly created c1
> you will need to:
>
> * Add the column c1 to the live T1
> * join on live T2 *and* shadow T2 to get the data needed and populate
> the new column.
> * Add the column c1 to the shadow T1
> * join on live T2 *and* shadow T2 to get the data needed and populate
> the new column.
>
> Hence - exponentially more joins, as I stated in my previous email.
>
> Now - this was the *simplest* possible example - things get potentially
> much more complicated if the new column relies on previous state of data
> (say - counters of some sort), if you need to get data from a third
> table (think many-to-many relationships) etc.
>
> If you need a real example - take a look at migration 186 in the current
> trunk.
>
> As I said in the previous email, and based on the examples above - this
> design decision (unconstrained rows) makes it difficult to reason about
> data in the system!
>
> I personally - as a developer working on the codebase - am not happy
> making this trade-off in favour of archiving in this way - and would
> like to see some design decisions changed, or at the very least a more
> broad consensus, that the state as-is is actually OK and we don't need
> to worry about it.

I agree that it's a mess.  However, the current archiving code just does 
the simplest thing possible -- move what can be moved to shadow tables, 
and try again later if foreign key constraints prevent that.  It's 
certainly possible to do something more clever, but that would require 
the archiving code to know more about all the other tables in the 
system, which sounds difficult to maintain.
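
To make the "move what can be moved, retry later" behaviour concrete, here is a minimal sketch using SQLite. It is not nova's actual archiving code: the table names t1/t2 are borrowed from the example quoted above, and the shadow_ prefix, the integer `deleted` column, and the functions make_db, archive_table, and demo are all assumptions made for illustration.

```python
import sqlite3


def make_db():
    """Build a hypothetical schema: t1 has a FK to t2, each with a shadow table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript("""
        CREATE TABLE t2 (id INTEGER PRIMARY KEY,
                         deleted INTEGER NOT NULL DEFAULT 0);
        CREATE TABLE t1 (id INTEGER PRIMARY KEY,
                         t2_id INTEGER NOT NULL REFERENCES t2(id),
                         deleted INTEGER NOT NULL DEFAULT 0);
        -- Shadow tables mirror the columns but carry no constraints.
        CREATE TABLE shadow_t2 (id INTEGER, deleted INTEGER);
        CREATE TABLE shadow_t1 (id INTEGER, t2_id INTEGER, deleted INTEGER);
    """)
    return conn


def archive_table(conn, table):
    """Move soft-deleted rows of `table` to its shadow table.

    Rows whose removal would violate a FK are skipped and left for a
    later pass. Returns the number of rows actually moved.
    """
    moved = 0
    ids = conn.execute(f"SELECT id FROM {table} WHERE deleted != 0").fetchall()
    for (row_id,) in ids:
        try:
            with conn:  # one transaction per row; rolled back on error
                conn.execute(
                    f"INSERT INTO shadow_{table} SELECT * FROM {table} WHERE id = ?",
                    (row_id,))
                conn.execute(f"DELETE FROM {table} WHERE id = ?", (row_id,))
            moved += 1
        except sqlite3.IntegrityError:
            pass  # still referenced by another table; try again later
    return moved


def demo():
    conn = make_db()
    conn.execute("INSERT INTO t2 (id, deleted) VALUES (1, 1)")
    conn.execute("INSERT INTO t1 (id, t2_id, deleted) VALUES (10, 1, 1)")
    conn.commit()
    first = archive_table(conn, "t2")   # blocked: t1 row 10 still points at it
    second = archive_table(conn, "t1")  # nothing references t1, so it moves
    third = archive_table(conn, "t2")   # the retry now succeeds
    return first, second, third
```

The archiver here needs no knowledge of how tables relate to each other; it only reacts to constraint errors, which is what keeps it simple and also why it can take several passes to finish.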

Unfortunately, soft-deleted rows still satisfy FK constraints, so rows 
that point to soft-deleted rows never get deleted, which leaves junk 
permanently behind in some tables (like nova's instances). 
Soft-deletes let people get away with being sloppy, so people are sloppy.
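
A small illustration of that point, again with SQLite and a made-up parent/child schema (the table names and the `deleted` column are assumptions for the example, not nova's schema): the database accepts a new row that references a soft-deleted row, whereas a real DELETE of the referenced row is refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.execute(
    "CREATE TABLE parent (id INTEGER PRIMARY KEY, "
    "deleted INTEGER NOT NULL DEFAULT 0)")
conn.execute(
    "CREATE TABLE child (id INTEGER PRIMARY KEY, "
    "parent_id INTEGER NOT NULL REFERENCES parent(id))")

conn.execute("INSERT INTO parent (id) VALUES (1)")
# Soft delete: the row stays in the table, only a flag changes.
conn.execute("UPDATE parent SET deleted = 1 WHERE id = 1")

# The FK only checks that parent row 1 exists, so this is accepted
# even though the parent is logically gone.
conn.execute("INSERT INTO child (id, parent_id) VALUES (10, 1)")
child_count = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]

# A hard delete, by contrast, is rejected while the child row exists.
try:
    conn.execute("DELETE FROM parent WHERE id = 1")
    hard_delete_blocked = False
except sqlite3.IntegrityError:
    hard_delete_blocked = True
```

With hard deletes the database forces the caller to clean up the child first; with soft deletes nothing does.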

What I'd really like to see is for the entire soft-delete idea to go 
away, and just delete rows when it's time to delete them.  Does anyone 
remember why soft-deletes got added in the first place?

-- 
David Ripton   Red Hat   dripton at redhat.com

[openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)