[openstack-dev] [nova] Distributed Database
Clint Byrum
clint at fewbar.com
Fri Apr 29 05:53:07 UTC 2016
Excerpts from Mike Bayer's message of 2016-04-28 22:16:54 -0500:
>
> On 04/28/2016 08:25 PM, Edward Leafe wrote:
>
> > Your own tests showed that a single RDBMS instance doesn’t even break a sweat
> > under your test loads. I don’t see why we need to shard it in the first
> > place, especially if in doing so we add another layer of complexity and
> > another dependency in order to compensate for that choice. Cells are a useful
> > concept, but this proposed implementation is adding way too much complexity
> > and debt to make it worthwhile.
>
> now that is a question I have also. Horizontal sharding is usually for
> the case where you need to store say, 10B rows, and you'd like to split
> it up among different silos. Nothing that I've seen about Nova suggests
> this is a system with any large data requirements, or even medium size
> data (a few million rows in relational databases is nothing). I
> didn't have the impression that this was the rationale behind Cells, it
> seems like this is more of some kind of logical separation of some kind
> that somehow suits some environments (but I don't know how).
> Certainly, if you're proposing a single large namespace of data across a
> partition of nonrelational databases, and then the data size itself is
> not that large, as long as "a single namespace" is appropriate then
> there's no reason to break out of more than one MySQL database. There's
> not much reason to transparently shard unless you are concerned about
> adding limitless storage capacity. The Cells sharding seems to be
> intentionally explicit and non-transparent.
>
There's a bit more to it than the number of rows. There's also a desire
to limit failure domains. IMO, that is entirely unfounded, as I've run
thousands of servers that depended on a single pair of MySQL servers
using simple DRBD and pacemaker with a floating IP for failover. This
is the main reason MySQL is a thing... it can handle 100,000 concurrent
connections just fine, and the ecosystem around detecting and handling
failure/maintenance is mature.
The whole cells conversation, IMO, stems from the way we use RabbitMQ.
We should just stop doing that. I know as I move forward with our scaling
efforts, I'll be trying several RPC drivers and none of them will go
through RabbitMQ.
More information about the OpenStack-dev
mailing list