Open Stack

Thu Nov 21 15:49:41 UTC 2013

On 11/21/2013 02:58 AM, Soren Hansen wrote:
> 2013/11/20 Chris Friesen <chris.friesen at windriver.com>:
>> What about a hybrid solution?
>> There is data that is only used by the scheduler--for performance reasons
>> maybe it would make sense to store that information in RAM as described at
>>
>> https://blueprints.launchpad.net/nova/+spec/no-db-scheduler
>>
>> For the rest of the data, perhaps it could be persisted using some alternate
>> backend.
>
> What would that solve?

The scheduler has performance issues.  Currently the design is
suboptimal--the compute nodes write resource information to the
database, then the scheduler pulls a bunch of data out of the database,
copies it over into Python, and analyzes it in Python to do the filtering.

For large clusters this can lead to significant time spent scheduling.

Based on the above, for performance reasons it would be beneficial for
the scheduler to have the necessary data already available in Python
rather than needing to pull it out of the database.

For other uses of the database people are proposing alternatives to SQL 
in order to get reliability.  I don't have any experience with that so I 
have no opinion on it.  But as long as the data is sitting on-disk (or 
even in a database process instead of in the scheduler process) it's 
going to slow down the scheduler.

If the primary consumer of a given piece of data (free RAM, free CPU,
free disk, etc.) is the scheduler, then I think it makes sense for the
compute nodes to report it directly to the scheduler.
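
To make the idea concrete, here is a rough sketch of what "report directly to the scheduler" could look like. This is not Nova code and not the design in the no-db-scheduler blueprint; the names (HostState, InMemoryScheduler, report, candidates) and the staleness timeout are all made up for illustration, and the transport a compute node would use to call report() is left out entirely.

```python
from dataclasses import dataclass
import time


@dataclass
class HostState:
    """Latest resources reported by one compute node (hypothetical shape)."""
    free_ram_mb: int
    free_vcpus: int
    free_disk_gb: int
    updated_at: float


class InMemoryScheduler:
    """Keeps host state in process memory instead of reading it from a database."""

    def __init__(self, stale_after_s=60.0):
        self._hosts = {}
        # Assumption: reports older than this are ignored, since a silent
        # node may be down. The value is arbitrary.
        self._stale_after_s = stale_after_s

    def report(self, host, free_ram_mb, free_vcpus, free_disk_gb, now=None):
        """Called on behalf of a compute node; overwrites its previous report."""
        now = time.time() if now is None else now
        self._hosts[host] = HostState(free_ram_mb, free_vcpus, free_disk_gb, now)

    def candidates(self, ram_mb, vcpus, disk_gb, now=None):
        """Filter hosts using only data already held in memory."""
        now = time.time() if now is None else now
        return sorted(
            name
            for name, state in self._hosts.items()
            if now - state.updated_at <= self._stale_after_s
            and state.free_ram_mb >= ram_mb
            and state.free_vcpus >= vcpus
            and state.free_disk_gb >= disk_gb
        )


if __name__ == "__main__":
    sched = InMemoryScheduler()
    sched.report("node1", free_ram_mb=8192, free_vcpus=4, free_disk_gb=100)
    sched.report("node2", free_ram_mb=2048, free_vcpus=2, free_disk_gb=500)
    print(sched.candidates(ram_mb=4096, vcpus=2, disk_gb=50))
```

The filtering step never touches a database, which is the whole point. The obvious cost is that the state is lost if the scheduler process restarts, so it would have to be rebuilt from fresh reports.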

Chris

[openstack-dev] [Nova] Does Nova really need an SQL database?