[openstack-dev] [nova] Distributed Database

Mark Doffman mdoffman at linux.vnet.ibm.com
Wed May 4 00:05:54 UTC 2016


This thread has been a depressing read.

I understand that the topic is supposed to be distributed databases,
but for me it has become an inquisition into CellsV2.

Our question has clearly become "Should we continue efforts on
CellsV2?", which I will address head-on.

We shouldn't be afraid to abandon CellsV2. If there are designs that are
proven to be a better solution, then our current momentum shouldn't keep
us from an abrupt change. As someone who is working on this I have an
attachment to the current design, but it's important for me to keep an
open mind.

Here are my *main* reasons for continuing work on CellsV2.

1. It provides a proven solution to an immediate message queue problem.

Yes, CellsV2 is different from CellsV1, but the previous solution showed
that application-level sharding of the message queue can work. CellsV2
provides this solution with a (moderately) easy upgrade path for
existing deployments. These deployments may not be comfortable with
changing MQ technologies or may already be using CellsV1.
Application-level sharding of the message queue is not pretty, but it
will work.

2. The 'complexity' of CellsV2 is vastly overstated.

Sure, there is a lot of *work* to do for CellsV2, but this doesn't imply
increased complexity: any refactoring requires work. CellsV1 added
complexity to our codebase; CellsV2 does not. In fact, by clearly
separating the data that is 'owned' by each of our services, I
believe that we are improving the modularity and encapsulation present
in Nova.

3. CellsV2 does not prohibit *ANY* of the alternative scaling methods
    mentioned in this thread.

Really, it doesn't. Both message queue and database switching are
completely optional, whether you are running a single cell or
multiple cells. If anything, the ability to run separate message
queues and database connections could give us the ability to trial
these alternative technologies within a real, running cloud.

Just imagine the ability to set up a cell in your existing cloud that
runs 0mq rather than rabbit. How about a NewSQL database integrated
into an existing cloud? Both of these things may (with some work) be
possible.
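
To make point 3 a little more concrete, here is a minimal Python sketch
of per-cell connection selection. It is only an illustration under my
own assumptions, not Nova's actual code: the names CellMapping, CELLS
and connections_for, the cell names, and every URL (including the
placeholder "newsql://" scheme) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellMapping:
    """Hypothetical record of where one cell's MQ and database live."""
    name: str
    transport_url: str        # message queue endpoint for this cell
    database_connection: str  # database endpoint for this cell


# Hypothetical deployment: cell1 keeps the existing backends, while
# cell2 trials a different message queue and a different database.
CELLS = {
    "cell1": CellMapping(
        "cell1",
        "rabbit://mq1.example.org:5672/",
        "mysql://db1.example.org/nova",
    ),
    "cell2": CellMapping(
        "cell2",
        "zmq://mq2.example.org:9501/",
        "newsql://db2.example.org/nova",  # placeholder scheme, not a real driver
    ),
}


def connections_for(cell_name, cells=CELLS):
    """Return (transport_url, database_connection) for the named cell."""
    try:
        cell = cells[cell_name]
    except KeyError:
        raise LookupError(f"unknown cell: {cell_name!r}") from None
    return cell.transport_url, cell.database_connection
```

A single-cell deployment would simply be a mapping with one entry, so
nothing in this sketch forces anyone to switch message queue or
database.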



I could go on, but I won't. These are my main reasons and I'll stick to 
them.

It's difficult to be proven wrong, but sometimes necessary to get the
best product that we can. I don't think that the existence of
alternative message queue and database options is enough to stop CellsV2
work now. A proven solution that meets the upgrade constraints that we
have in Nova would be a good reason to do so. We should of course
explore other options; nothing we are doing prevents that. When they
work out, I'll be super excited.

Thanks

Mark

On 4/29/16 12:53 AM, Clint Byrum wrote:
> Excerpts from Mike Bayer's message of 2016-04-28 22:16:54 -0500:
>>
>> On 04/28/2016 08:25 PM, Edward Leafe wrote:
>>
>>> Your own tests showed that a single RDBMS instance doesn’t even break a sweat
>>> under your test loads. I don’t see why we need to shard it in the first
>>> place, especially if in doing so we add another layer of complexity and
>>> another dependency in order to compensate for that choice. Cells are a useful
>>> concept, but this proposed implementation is adding way too much complexity
>>> and debt to make it worthwhile.
>>
>> now that is a question I have also.  Horizontal sharding is usually for
>> the case where you need to store say, 10B rows, and you'd like to split
>> it up among different silos.  Nothing that I've seen about Nova suggests
>> this is a system with any large data requirements, or even medium size
>> data (a few million rows in relational databases is nothing).    I
>> didn't have the impression that this was the rationale behind Cells, it
>> seems like this is more of some kind of logical separation of some kind
>> that somehow suits some environments (but I don't know how).
>> Certainly, if you're proposing a single large namespace of data across a
>> partition of nonrelational databases, and then the data size itself is
>> not that large, as long as "a single namespace" is appropriate then
>> there's no reason to break out of more than one MySQL database.  There's
>> not much reason to transparently shard unless you are concerned about
>> adding limitless storage capacity.   The Cells sharding seems to be
>> intentionally explicit and non-transparent.
>>
>
> There's a bit more to it than the number of rows. There's also a desire
> to limit failure domains. IMO, that is entirely unfounded, as I've run
> thousands of servers that depended on a single pair of MySQL servers
> using simple DRBD and pacemaker with a floating IP for failover. This
> is the main reason MySQL is a thing... it can handle 100,000 concurrent
> connections just fine, and the ecosystem around detecting and handling
> failure/maintenance is mature.
>
> The whole cells conversation, IMO, stems from the way we use RabbitMQ.
> We should just stop doing that. I know as I move forward with our scaling
> efforts, I'll be trying several RPC drivers and none of them will go
> through RabbitMQ.



