[openstack-dev] [nova] Distributed Database

Andrew Laski andrew at lascii.com
Tue May 3 21:46:08 UTC 2016

On Mon, May 2, 2016, at 01:13 PM, Edward Leafe wrote:
> On May 2, 2016, at 10:51 AM, Mike Bayer <mbayer at redhat.com> wrote:
> >> Concretely, we think that there are three possible approaches:
> >>     1) We can use the SQLAlchemy API as the common denominator between a relational and non-relational implementation of the db.api component. These two implementations could continue to converge by sharing a large amount of code.
> >>     2) We create a new non-relational implementation (from scratch) of the db.api component. It would probably require more work.
> >>     3) We are also studying a last alternative: writing a SQLAlchemy engine that targets NewSQL databases (scalability + ACID):
> >>      - https://github.com/cockroachdb/cockroach
> >>      - https://github.com/pingcap/tidb
> > 
> > Going with a NewSQL backend is by far the best approach here.   That way, very little needs to be reinvented and the application's approach to data doesn't need to dramatically change.
> I’m glad that Matthieu responded, but I did want to emphasize one thing:
> of *course* this isn’t an ideal approach, but it *is* a practical one.
> The biggest problem in any change like this isn’t getting it to work, or
> to perform better, or anything else except being able to make the change
> while disrupting as little of the existing code as possible. Taking an
> approach that would be more efficient would be a non-starter since it
> wouldn’t provide a clean upgrade path for existing deployments.

I would like to point out that this same logic applies to the current
cellsv2 effort. It is a very practical set of changes which allows Nova
to move forward with only minor effort on the part of deployers. And it
moves towards a model that is already used and well understood by large
deployers of Nova while also learning from the shortcomings of the
previous architecture. In short, much of this is already battle tested
and proven.

If we started Nova from scratch (I hear golang is lovely for this sort
of thing), would we do things differently? Probably. However, that's not
the position we're in. We're able to make measurable progress with
cellsv2 at the moment and have a pretty clear idea of the end state. I
can recall conversations about NoSQL as far back as the San Diego
summit (that was my first, so I can't say they didn't happen earlier),
and this is the first time I've seen any measurable progress on moving
forward with it. But where it would go is not at all clear.

I also want to point out that what was being solved with ROME and what
cellsv2 is solving are two very different things. I saw the talk and was
very impressed, but it was looking to improve upon db access times in a
very specific deployment type. I didn't get the sense that the point
being made was that ROME/redis was the best solution generally, only
that for very geographically distributed controllers with a shared
database it performed much better than an active/active Galera cluster
with a large number of nodes.

> Getting this working without ripping out all of the data models that
> currently exist is an amazing feat. And if by doing so it shows that a
> distributed database is indeed possible, it’s done more than anything
> else that has ever been discussed in the past few years. 
> -- Ed Leafe
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe:
> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
