[openstack-dev] [Nova] Does Nova really need an SQL database?

Jay Pipes jaypipes at gmail.com
Mon Nov 18 20:14:25 UTC 2013


On 11/18/2013 02:35 PM, Mike Spreitzer wrote:
> There were some concerns expressed at the summit about scheduler
> scalability in Nova, and a little recollection of Boris' proposal to
> keep the needed state in memory.

While it would be possible to keep all of the scheduler state in memory, I
think a better (or at least initially less cumbersome) approach would be
to add layers of in-memory caching wherever the scheduler currently makes
a database query. The problem with this is that the design won't scale
out, since the scheduler's cached pieces cannot easily be shared across
distributed nodes. This is where cells and a hierarchical "sieve
scheduling" pattern come in: a higher-level cell scheduler can quickly
hand a scheduling request to another cell's scheduler based on a small
amount of information that can generally be checked against in-memory
data (region, availability zone, hypervisor type, etc.).
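
As a rough illustration of the sieve idea, here is a minimal sketch in
which a top-level scheduler narrows the candidate cells using only cheap
in-memory attributes. The `Cell` class, its fields, and `pick_cells` are
hypothetical names made up for this example, not Nova's actual cells API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical in-memory summary of a child cell (not Nova's real model)."""
    name: str
    region: str
    availability_zone: str
    hypervisor_type: str


def pick_cells(cells, **required):
    """Return the cells whose cached attributes match every requested value.

    No database query is made here: the comparison runs only against the
    small amount of per-cell data held in memory.
    """
    return [
        cell for cell in cells
        if all(getattr(cell, key) == value for key, value in required.items())
    ]


cells = [
    Cell("cell-a", "us-east", "az1", "kvm"),
    Cell("cell-b", "us-east", "az2", "xen"),
    Cell("cell-c", "eu-west", "az1", "kvm"),
]

# The top-level scheduler only sieves; the chosen cell's own scheduler
# would then do the detailed host selection.
candidates = pick_cells(cells, region="us-east", hypervisor_type="kvm")
print([cell.name for cell in candidates])
```

The point of the sketch is that the top level never needs per-host state,
so its cache stays small enough to replicate on every scheduler node.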

> I also heard one guy say that he thinks Nova does not really need a
> general SQL database, that a NOSQL database with a bit of
> denormalization and/or client-maintained secondary indices could
> suffice.  Has that sort of thing been considered before?  What is the
> community's level of interest in exploring that?

Good luck. :)  I don't think that whoever suggested that a NoSQL
database with a "bit of denormalization" would suffice for Nova realized
the extent to which the sets of data within Nova's database are highly
relational. You will just end up implementing JOIN algorithms in Python
code and making some of the more advanced search queries much slower, IMO.
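
To make that concrete, here is a minimal sketch of what a client-side
join looks like when the data lives in plain key-value collections. The
record layout and field names are made up for illustration and do not
reflect Nova's actual schema.

```python
from collections import defaultdict

# Hypothetical denormalized records, as they might come out of a
# key-value store.
instances = [
    {"uuid": "i-1", "host": "node1", "flavor_id": 1},
    {"uuid": "i-2", "host": "node2", "flavor_id": 2},
    {"uuid": "i-3", "host": "node1", "flavor_id": 2},
]
flavors = [
    {"id": 1, "name": "m1.small", "memory_mb": 2048},
    {"id": 2, "name": "m1.large", "memory_mb": 8192},
]


def hash_join(left, right, left_key, right_key):
    """Hand-rolled hash join: the work SQL's JOIN ... ON would do for us."""
    # Build an index over the right-hand side, keyed by the join column.
    index = defaultdict(list)
    for row in right:
        index[row[right_key]].append(row)
    # Probe the index once per left-hand row and merge matching rows.
    for row in left:
        for match in index.get(row[left_key], []):
            yield {**match, **row}


# Roughly: SELECT host, SUM(memory_mb) FROM instances
#          JOIN flavors ON instances.flavor_id = flavors.id GROUP BY host
memory_by_host = defaultdict(int)
for joined in hash_join(instances, flavors, "flavor_id", "id"):
    memory_by_host[joined["host"]] += joined["memory_mb"]

print(dict(memory_by_host))
```

Each further relationship needs another join like this, plus index
maintenance that the database would otherwise handle.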

Oh, and BTW, Nova's "database" was originally Redis [1] :)

Best,
-jay

[1]
https://github.com/openstack/nova/blob/bf6e6e718cdc7488e2da87b21e258ccc065fe499/nova/datastore.py


