[openstack-dev] [nova] FKs in the DB
Clint Byrum
clint at fewbar.com
Fri Nov 20 21:33:58 UTC 2015
Excerpts from Mike Bayer's message of 2015-11-20 11:29:31 -0800:
>
> On 11/20/2015 11:19 AM, Alexis Lee wrote:
> > We just had a fun discussion in IRC about whether foreign keys are evil.
> > Initially I thought this was crazy but mordred made some good points. To
> > paraphrase, that if you have a scale-out app already it's easier to
> > manage integrity in your app than scale-out your persistence layer.
>
> I've had this argument with mordred before, and it seems again there's
> the same misunderstanding going on:
>
> 1. Your application can have **conceptual** foreign keys in it, without
> actually having foreign keys **for real** in the database. This means
> your SQLAlchemy code still does ForeignKey, ForeignKeyConstraint, and
> most importantly your **database still uses normal form**, that is, any
> row that refers to another does it based on a set of columns that
> exactly match to the primary key of a single table elsewhere (not to
> multiple tables, not to a function of the columns cast from int to
> string and concatenated to the value in the other table etc, an *exact
> match*). I'm sure that mordred agrees with all of these practices,
> however when one says "we aren't going to use foreign keys anymore",
> typically it is all these critical schema design practices that go out
> the window. Put another way, the foreign key concept not only
> constrains data in a real database, just the concept of them constraints
> the **developer** to use correct normal form.
>
Mike, thanks for making that clarification. I agree, that conceptual
FK's are not the same as FK constraints in the DB. Joins are not demons.
:)
To be clear, while what you say above is all true, normal form isn't
actually a goal of any system. It's a tactic one can use with a well known
efficiency profile. But there are times when it costs more than other
more brutal, less civilized methods of database design and usage. If we
don't measure our efficiency, we won't actually know if this is one of
those times or not.
> 2. Here's the part mordred doesn't like - the FK is actually in the
> database for real. This is because they slow down inserts, updates,
> and deletes, because they must be checked. To which I say, no such
> performance issue has been observed or documented in Openstack, we
> aren't a 1 million TPS financial system, so this is vastly premature
> optimization.
>
I agree with you that this is unmeasured. I don't agree that we are not a
1 million TPS financial system, because the goal of running a cloud for
many is, in fact, to make money. So while we may not have an example of
a cloud running at 1 million TPS, it's not something we should dismiss
too quickly.
That said, the measurement should come first. What I'd like to show
is how many TPS we do actually do on boots, deletes, etc. etc. I'm
working on it now, and I'd encourage people to join the effort on the
counter-inspection QA spec if they want to get started measuring things.
We're taking baby steps right now, but eventually I see us producing a
lot of data that should be helpful in answering some of these questions.
> Also as far as #2, customers and operators *regularly* run scripts and
> queries to modify openstack databases, particularly to delete soft
> deleted rows. These get blocked *all the time* by foreign key
> constraints. They are doing their job, and they are needed as a final
> guard against data integrity issues. We of course handle referential
> integrity in the application layer as well via SQLAlchemy ORM constructs.
>
Nobody ever doubts that there are times where database-side FK constraints
help prevent costly mistakes. The question is always: at what cost? Right
now, I'd say we don't really know because we're not measuring. That's
fine, if you want to mitigate risk, one strategy is to buy insurance,
and that's what the FK's in the DB are: insurance.
More information about the OpenStack-dev
mailing list