Clint Byrum clint at fewbar.com
Tue Mar 10 01:27:35 UTC 2015

Excerpts from Mike Bayer's message of 2015-03-09 17:26:36 -0700:
> Clint Byrum <clint at fewbar.com> wrote:
> > 
> > So I think I didn't speak clearly enough here. The benchmarks are of
> > course needed, but there's a tipping point when write activity gets to
> > a certain level where it's cheaper to let it get a little skewed and
> > correct asynchronously. This is not unique to SQL, this is all large
> > scale distributed systems. There's probably a super cool formula for it
> > too, but roughly it is
> > 
> > (num_trans_per_s * cost_of_fk_check_per_trans)
> > 
> > versus
> > 
> > (error_cost * error_rate)+(cost_find_all_errors/seconds_to_find_all_errors)
> Well the error cost here would be a database that would be “corrupted”,
> meaning it has rows which no longer refer to things that exist and the
> database is now in a case where it may very well be unusable by the
> application, without being rolled back to some known state. 

That's not a cost, that's a situation. What's the actual cost to the
user? "may very well be unusable" implies uncertainty, which is certainly
a risk, but the cost is unknown. Typically one must estimate the cost
with each error found.

> If Keystone truly doesn’t care about ACID it might want to consider MyISAM
> tables, which are faster for read-heavy workloads, though these aren’t
> compatible with Galera.

Please try to refrain from using false equivalence. ACID stands for
Atomicity, Consistency, Isolation, Durability. Nowhere in there does it
stand for "referential integrity". If Keystone uses transactions
properly, ACID is preserved. Also I don't think it is productive to
bring up MyISAM in any serious conversation about databases.

> > So it's not really something I think one can blindly accept as "better",
> > but rather something that one needs to calculate for themselves. You say
> > cost_of_fk_check_per_trans is negligible, but that has been measured as
> > not true in the past:
> > 
> > http://www.percona.com/blog/2010/09/20/instrumentation-and-the-cost-of-foreign-keys/
> That’s not a surprising case because the “parent” row being modified is
> being referred to by the “child” row that’s still in transaction. This is an
> implementation detail of the ACID guarantees which one gets when they use a
> relational database. If Keystone’s relational backend in fact has a
> performance bottleneck due to an operation like this, it should be visited
> individually. But I think it’s extremely unlikely this is actually the case.

Lock contention is a real thing that will inevitably slow down transaction
speed if not carefully avoided. One less query (which is what FK checks
end up being) means one less read lock taken and one less place to have
to think through.

In practical matters, the fact that identity and assignment are not
allowed to FK does practically shutdown most of the real possibilities
of this type of contention.

> > That article demonstrates that the FK adds lock contention in
> > InnoDB. There's more. With NDB (MySQL cluster) it's an 18% performance
> > hit on raw throughput:
> > 
> > http://johanandersson.blogspot.com/2013/06/benchmarking-performance-impact-of.html
> For NDB cluster, foreign key support was only added to that system two years
> ago, in version 5.6.10 in 2013. This is clearly not a system designed to
> support foreign keys in the first place, the feature is entirely bleeding
> edge for that specific system, and performance like that is entirely
> atypical outside for database systems outside of NDB cluster. Specifically
> with Openstack, the clustering solution usually used is Galera which has no
> such performance issue.
> So sure, if you’re using NDB cluster, FOREIGN KEY support is
> bleeding edge and you may very well want to disable constraints as you’re
> using a system that wasn’t designed with this use case in mind. But because
> using a relational database is somewhat pointless if you don’t need ACID,
> I’d probably use Galera instead.

NDB is probably overkill for Keystone until we get up into the millions
of users scale. One day maybe :). The point is that this is a high performance
DB with high performance demands and it is 18% slower for some types of
operations when FK's are added.

> > 
> > Now, where that point is with Keystone I don't know. The point is, if you
> > write the code relying on the existence, Keystone becomes a vertically
> > scaling app that cannot ever scale out beyond whatever that limit is.
> There seems to be some misunderstanding that using foreign keys to enforce
> referential integrity seems to imply that the application is now dependent
> on these constraints being in place. I notice that the conversation was
> originally talking a bit about allowing rows to be deleted using CASCADE,
> and my original question referred to the notion of foreign key use
> *overall*, not specifically as a means to offer automatic deletion of
> related rows with CASCADE.   The use of foreign key constraints
> in openstack applications does not imply an unbreakable reliance
> upon them at all, for two reasons.
> For the first reason, foreign keys first and foremost offer nothing more
> than an integrity guarantee that prevents a particular row from being
> deleted or having its primary key modified such that other rows which refer
> to that primary key would now be left with an invalid reference. At this
> level, you can have an application that is working perfectly, you can then
> shut off the foreign key constraints entirely, and the application will
> continue to work perfectly with no change; the difference is only that if
> the application at some point failed, due to bugs either present or newly
> introduced, or if an end user tried to tinker with the database directly,
> operations which would leave referencing rows hanging will not be blocked.
> So the use of foreign keys at this level has nothing to do with the
> application relying upon their existence; they are nothing more than
> integrity guarantees which can be silently removed.

I'm not entirely sure what you've said above actually prevents coders
from relying on the constraints. Being careful about deleting all of the
child rows before a parent is good practice. I have seen code like this
in the past though:

except ForeignKeyFailure:

This means if you don't have the FK's, you may never delete the
children. Is this a bug? YES. Is it super obvious that it is the wrong
thing to do? No.

> For the second, within the realm of ON DELETE CASCADE specifically, Keystone
> like all other Openstack applications uses SQLAlchemy for relational
> persistence. SQLAlchemy supports foreign key constraints that support or
> don’t support ON DELETE CASCADE transparently; meaning, whether or not a
> foreign key can be relied upon to delete dependent rows is nothing more than
> a configuration option in the mappings; SQLAlchemy will continue to emit a
> DELETE statement for rows that are locally mirrored in memory but if it
> knows that the foreign key has ON DELETE CASCADE set up, it will let the
> rows that are unloaded be deleted automatically. With the flag off, it will
> attempt to locate those rows as well. So in this situation, the use of ON
> DELETE CASCADE *improves* performance, by reducing database round trips.

Right, which is why as usual "it depends". If Keystone did a lot of
child-row deletes, then it is a net gain in latency. But if it does a
lot of inserts it's a huge loss, because every time you insert a row that
has an FK constraint, you need to do an index lookup which is effectively
an extra query which your severely limited central database has to
handle all by itself without help of slaves.

More information about the OpenStack-dev mailing list