[openstack-dev] [nova] Distributed Database

Mike Bayer mbayer at redhat.com
Tue May 3 16:04:00 UTC 2016

On 05/02/2016 01:48 PM, Clint Byrum wrote:
> FWIW, I agree with you. If you're going to use SQLAlchemy, use it to
> take advantage of the relational model.
> However, how is what you describe a win? Whether you use SELECT .. FOR
> UPDATE, or a stored procedure, the lock is not distributed, and thus, will
> still suffer rollback failures in Galera. For single DB server setups, you
> don't have to worry about that, and SELECT .. FOR UPDATE will work fine.

Well it's a "win" vs. the lesser approach considered which also did not 
include a distributed locking system like Zookeeper.   It is also a win 
even with a Zookeeper-like system in place because it allows a SQL query 
to be much smarter about selecting data that involves IP numbers and 
CIDRs, without the need to pull data into memory and process it there. 
This is the most common mistake in SQL programming, not taking advantage 
of SQL's set-based nature and instead pulling data into memory 

Also, the "federated MySQL" approach of Cells V2 would still be OK with 
pessimistic locking, since this lock is not "distributed" across the 
entire dataspace.   Only the usual Galera caveats apply, e.g. point to 
only one galera "master" at a time and/or wait for Galera to support 
"SELECT FOR UPDATE" across the cluster.

> Furthermore, any logic that happens inside the database server is extra
> load on a much much much harder resource to scale, using code that is
> much more complicated to update.

So I was careful to use the term "stored function" and not "stored 
procedure".   As ironic as it is for me to defend both the ORM 
business-logic-in-the-application-not-the-database position, *and* the 
let-the-database-do-things-not-the-application at the same time, using 
database functions to allow new kinds of math and comparison operations 
to take place over sets is entirely reasonable, and should not be 
confused with the old-school big-business approach of building an entire 
business logic layer as a huge wall of stored procedures, this is 
nothing like that.

The Postgresql database has INET and CIDR types native which include the 
same overlap logic we are implementing here as a MySQL stored function, 
so the addition of math functions like these shouldn't be controversial. 
   The "load" of this function is completely negligible (however I would 
be glad to assist in load testing it to confirm), especially compared to 
pulling the same data across the wire, processing it in Python, then 
sending just a tiny portion of it back again after we've extracted the 
needle from the haystack.

In pretty much every kind of load testing scenario we do with Openstack, 
the actual "load" on the database barely pushes anything.   The only 
database "resource" issue we have is Openstack using far more idle 
connections than it should, which is on my end to work on improvements 
to the connection pooling system which does not scale well across 
Openstack's tons-of-processes model.

> To be clear, it's not the amount of data, but the size of the failure
> domain. We're more worried about what will happen to those 40,000 open
> connections from our 4000 servers when we do have to violently move them.

That's a really big number and I will admit I would need to dig into 
this particular problem domain more deeply to understand what exactly 
the rationale of that kind of scale would be here.   But it does seem 
like if you were using SQL databases, and the 4000 server system is in 
fact grouped into hundreds of "silos" that only deal with strict 
segments of the total dataspace, a federated approach would be exactly 
what you'd want to go with.

> That particular problem isn't as scary if you have a large
> Cassandra/MongoDB/Riak/ROME cluster, as the client libraries are
> generally connecting to all or most of the nodes already, and will
> simply use a different connection if the initial one fails. However,
> these other systems also bring a whole host of new problems which the
> simpler SQL approach doesn't have.

Regarding ROME, I only seek to make the point that if you're going to 
switch to NoSQL, you have to switch to NoSQL.   Bolting SQLAlchemy on 
top of Redis without a mature and widely-proven relational layer in 
between, down to the level of replicating the actual tables that were 
built within a relational schema, is a denial of the reality of the 
problem to be solved.

> So it's worth doing an actual analysis of the failure handling before
> jumping to the conclusion that a pile of cells/sharding code or a rewrite
> to use a distributed database would be of benefit.
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

More information about the OpenStack-dev mailing list