[openstack-dev] [Nova] Does Nova really need an SQL database?

Stephen Gran stephen.gran at theguardian.com
Thu Nov 21 16:52:09 UTC 2013

On 21/11/13 15:49, Chris Friesen wrote:
> On 11/21/2013 02:58 AM, Soren Hansen wrote:
>> 2013/11/20 Chris Friesen <chris.friesen at windriver.com>:
>>> What about a hybrid solution?
>>> There is data that is only used by the scheduler--for performance
>>> reasons maybe it would make sense to store that information in RAM as
>>> described at
>>> https://blueprints.launchpad.net/nova/+spec/no-db-scheduler
>>> For the rest of the data, perhaps it could be persisted using some
>>> alternate backend.
>> What would that solve?
> The scheduler has performance issues. Currently the design is
> suboptimal--the compute nodes write resource information to the
> database, then the scheduler pulls a bunch of data out of the database,
> copies it over into python, and analyzes it in python to do the filtering.
> For large clusters this can lead to significant time spent scheduling.
> Based on the above, for performance reasons it would be beneficial for
> the scheduler to have the necessary data already available in python
> rather than needing to pull it out of the database.
> For other uses of the database people are proposing alternatives to SQL
> in order to get reliability. I don't have any experience with that so I
> have no opinion on it. But as long as the data is sitting on-disk (or
> even in a database process instead of in the scheduler process) it's
> going to slow down the scheduler.
> If the primary consumer of a given piece of data (free ram, free cpu,
> free disk, etc) is the scheduler, then I think it makes sense for the
> compute nodes to report it directly to the scheduler.

I suspect that a large performance gain could be had by two fairly simple
changes:

a) Break the scheduler in two, so that the chunk of code receiving
updates from the compute nodes can't block the chunk of code that does
the scheduling.

b) Use a memcache backend instead of SQL for compute resource information.

My fear with keeping data local to a scheduler instance is that local 
state destroys scalability.
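
To make (a) and (b) a bit more concrete, here is a rough sketch of the
shape I have in mind. It is not Nova code: ResourceCache is a
hypothetical in-process stand-in for a shared memcache-style store, and
run_updater, pick_host and the free_ram_mb / free_disk_gb keys are names
I made up for illustration. The point is only that the receiving half
writes to the shared store and the scheduling half reads from it, so
neither keeps private state.

```python
import queue
import threading


class ResourceCache:
    """Hypothetical stand-in for a shared memcache-style key/value store."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = dict(value)

    def get_all(self):
        # Return a copy so readers never see a half-applied update.
        with self._lock:
            return {k: dict(v) for k, v in self._data.items()}


def run_updater(updates, cache):
    """Receiving half: drain compute-node reports into the cache."""
    while True:
        report = updates.get()
        if report is None:  # sentinel: stop the updater
            break
        host, resources = report
        cache.set(host, resources)


def pick_host(cache, ram_mb, disk_gb):
    """Scheduling half: filter hosts using only the cached snapshot."""
    candidates = [
        (res["free_ram_mb"], host)
        for host, res in cache.get_all().items()
        if res["free_ram_mb"] >= ram_mb and res["free_disk_gb"] >= disk_gb
    ]
    # Prefer the host with the most free RAM; None if nothing fits.
    return max(candidates)[1] if candidates else None


updates = queue.Queue()
cache = ResourceCache()
updater = threading.Thread(target=run_updater, args=(updates, cache))
updater.start()

# Simulated reports from three compute nodes (made-up numbers).
updates.put(("node1", {"free_ram_mb": 2048, "free_disk_gb": 40}))
updates.put(("node2", {"free_ram_mb": 8192, "free_disk_gb": 10}))
updates.put(("node3", {"free_ram_mb": 4096, "free_disk_gb": 100}))
updates.put(None)
updater.join()

chosen = pick_host(cache, ram_mb=1024, disk_gb=20)
print(chosen)
```

Because the scheduling half only ever reads from the store, any number
of scheduler processes could in principle run against the same backend,
which is what keeping the data local to one scheduler would rule out.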

Just a thought.

Stephen Gran
Senior Systems Integrator - theguardian.com