[openstack-dev] [nova] FKs in the DB

Mike Bayer mbayer at redhat.com
Fri Nov 20 19:29:31 UTC 2015



On 11/20/2015 11:19 AM, Alexis Lee wrote:
> We just had a fun discussion in IRC about whether foreign keys are evil.
> Initially I thought this was crazy but mordred made some good points. To
> paraphrase, that if you have a scale-out app already it's easier to
> manage integrity in your app than scale-out your persistence layer.

I've had this argument with mordred before, and it seems again there's
the same misunderstanding going on:

1. Your application can have **conceptual** foreign keys in it, without
actually having foreign keys **for real** in the database.  This means
your SQLAlchemy code still does ForeignKey, ForeignKeyConstraint, and
most importantly your **database still uses normal form**, that is, any
row that refers to another does it based on a set of columns that
exactly match to the primary key of a single table elsewhere (not to
multiple tables, not to a function of the columns cast from int to
string and concatenated to the value in the other table etc, an *exact
match*).   I'm sure that mordred agrees with all of these practices,
however when one says "we aren't going to use foreign keys anymore",
typically it is all these critical schema design practices that go out
the window.  Put another way, the foreign key concept not only
constrains data in a real database, just the concept of them constraints
the **developer** to use correct normal form.

2. Here's the part mordred doesn't like - the FK is actually in the
database for real.   This is because they slow down inserts, updates,
and deletes, because they must be checked.   To which I say, no such
performance issue has been observed or documented in Openstack, we
aren't a 1 million TPS financial system, so this is vastly premature
optimization.

Also as far as #2, customers and operators *regularly* run scripts and
queries to modify openstack databases, particularly to delete soft
deleted rows.  These get blocked *all the time* by foreign key
constraints.  They are doing their job, and they are needed as a final
guard against data integrity issues.  We of course handle referential
integrity in the application layer as well via SQLAlchemy ORM constructs.

3. Another aspect of FKs is using them for ON DELETE CASCADE.   I think
this is a great idea also, but I know that openstack apps are not
comfortable with this.  So we don't need to use it (but we should someday).



> 
> Currently the Nova DB has quite a lot of FKs but not on every relation.
> One example of a missing FK is between Instance.uuid and
> BandwidthUsageCache.uuid.
> 
> Should we drive one way or the other, or just put up with mixed-mode?
> 
> What should be the policy for new relations?

+1 for correct normalized with foreign keys in all cases.   A slowdown
that can be documented and illustrated will be needed to justify havint
that FK to be disabled or removed on the schema-side only, but there
would still be a "conceptual" foreign key (e.g. SQLAlchemy ForeignKey)
in the model.

> 
> Do the answers to these questions depend on having a sane and
> comprehensive archive/purge system in place?
> 
> 
> Alexis (lxsli)
> 



More information about the OpenStack-dev mailing list