[openstack-dev] [nova] Distributed Database
mbayer at redhat.com
Tue May 3 16:04:00 UTC 2016
On 05/02/2016 01:48 PM, Clint Byrum wrote:
> FWIW, I agree with you. If you're going to use SQLAlchemy, use it to
> take advantage of the relational model.
> However, how is what you describe a win? Whether you use SELECT .. FOR
> UPDATE, or a stored procedure, the lock is not distributed, and thus, will
> still suffer rollback failures in Galera. For single DB server setups, you
> don't have to worry about that, and SELECT .. FOR UPDATE will work fine.
Well it's a "win" vs. the lesser approach considered which also did not
include a distributed locking system like Zookeeper. It is also a win
even with a Zookeeper-like system in place because it allows a SQL query
to be much smarter about selecting data that involves IP numbers and
CIDRs, without the need to pull data into memory and process it there.
This is the most common mistake in SQL programming, not taking advantage
of SQL's set-based nature and instead pulling data into memory
Also, the "federated MySQL" approach of Cells V2 would still be OK with
pessimistic locking, since this lock is not "distributed" across the
entire dataspace. Only the usual Galera caveats apply, e.g. point to
only one galera "master" at a time and/or wait for Galera to support
"SELECT FOR UPDATE" across the cluster.
> Furthermore, any logic that happens inside the database server is extra
> load on a much much much harder resource to scale, using code that is
> much more complicated to update.
So I was careful to use the term "stored function" and not "stored
procedure". As ironic as it is for me to defend both the ORM
business-logic-in-the-application-not-the-database position, *and* the
let-the-database-do-things-not-the-application at the same time, using
database functions to allow new kinds of math and comparison operations
to take place over sets is entirely reasonable, and should not be
confused with the old-school big-business approach of building an entire
business logic layer as a huge wall of stored procedures, this is
nothing like that.
The Postgresql database has INET and CIDR types native which include the
same overlap logic we are implementing here as a MySQL stored function,
so the addition of math functions like these shouldn't be controversial.
The "load" of this function is completely negligible (however I would
be glad to assist in load testing it to confirm), especially compared to
pulling the same data across the wire, processing it in Python, then
sending just a tiny portion of it back again after we've extracted the
needle from the haystack.
In pretty much every kind of load testing scenario we do with Openstack,
the actual "load" on the database barely pushes anything. The only
database "resource" issue we have is Openstack using far more idle
connections than it should, which is on my end to work on improvements
to the connection pooling system which does not scale well across
Openstack's tons-of-processes model.
> To be clear, it's not the amount of data, but the size of the failure
> domain. We're more worried about what will happen to those 40,000 open
> connections from our 4000 servers when we do have to violently move them.
That's a really big number and I will admit I would need to dig into
this particular problem domain more deeply to understand what exactly
the rationale of that kind of scale would be here. But it does seem
like if you were using SQL databases, and the 4000 server system is in
fact grouped into hundreds of "silos" that only deal with strict
segments of the total dataspace, a federated approach would be exactly
what you'd want to go with.
> That particular problem isn't as scary if you have a large
> Cassandra/MongoDB/Riak/ROME cluster, as the client libraries are
> generally connecting to all or most of the nodes already, and will
> simply use a different connection if the initial one fails. However,
> these other systems also bring a whole host of new problems which the
> simpler SQL approach doesn't have.
Regarding ROME, I only seek to make the point that if you're going to
switch to NoSQL, you have to switch to NoSQL. Bolting SQLAlchemy on
top of Redis without a mature and widely-proven relational layer in
between, down to the level of replicating the actual tables that were
built within a relational schema, is a denial of the reality of the
problem to be solved.
> So it's worth doing an actual analysis of the failure handling before
> jumping to the conclusion that a pile of cells/sharding code or a rewrite
> to use a distributed database would be of benefit.
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
More information about the OpenStack-dev