On 10/09/2015 12:25 PM, Alec Hothan (ahothan) wrote:
> Still the point from Chris is valid. I guess the main reason OpenStack is
> going with multiple concurrent schedulers is to scale out by distributing the
> load between multiple instances of schedulers because 1 instance is too
> slow. This discussion is about coordinating the many instances of schedulers
> in a way that works, and this is actually a difficult problem that will get
> worse as the number of variables for instance placement increases (for
> example, NFV is going to require a lot more than just CPU pinning, huge pages
> and NUMA).
> Has anybody looked at why 1 instance is too slow and what it would take to
> make 1 scheduler instance work fast enough? This does not preclude the use of
> concurrency for finer-grained tasks in the background.

Currently we pull data on all (!) of the compute nodes out of the database via a
series of RPC calls, then evaluate the various filters in Python code.

I suspect it'd be a lot quicker if each filter were a DB query.
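
To make the comparison concrete, here is a rough sketch of the two approaches.
It is not Nova code: the compute_nodes table, its columns (free_ram_mb,
free_disk_gb) and the function names are all made up for illustration, and it
uses an in-memory SQLite database so it can be run standalone. The first
function fetches every row and filters in Python; the second expresses the same
two filters as a WHERE clause, so only matching hosts leave the database.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory table standing in for compute node state.

    The schema is hypothetical and only loosely modelled on the idea of
    per-host free resources.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE compute_nodes ("
        "host TEXT PRIMARY KEY, free_ram_mb INTEGER, free_disk_gb INTEGER)"
    )
    conn.executemany(
        "INSERT INTO compute_nodes VALUES (?, ?, ?)",
        [("node1", 2048, 100), ("node2", 8192, 20), ("node3", 16384, 500)],
    )
    return conn


def filter_in_python(conn, ram_mb, disk_gb):
    """Pull every row out of the DB, then apply the filters in Python."""
    rows = conn.execute(
        "SELECT host, free_ram_mb, free_disk_gb FROM compute_nodes"
    ).fetchall()
    return sorted(
        host for host, ram, disk in rows if ram >= ram_mb and disk >= disk_gb
    )


def filter_in_db(conn, ram_mb, disk_gb):
    """Express the same filters as a query so the DB returns only matches."""
    rows = conn.execute(
        "SELECT host FROM compute_nodes "
        "WHERE free_ram_mb >= ? AND free_disk_gb >= ? ORDER BY host",
        (ram_mb, disk_gb),
    ).fetchall()
    return [host for (host,) in rows]


if __name__ == "__main__":
    conn = make_db()
    print(filter_in_python(conn, 4096, 50))
    print(filter_in_db(conn, 4096, 50))
```

Both functions return the same hosts; the difference is how much data crosses
the DB boundary and where the comparisons run. Whether this is actually faster
at scale would depend on indexing and on how expressible each real filter is in
SQL, which this sketch does not try to answer.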

Also, ideally we'd want to apply the strictest criteria first, to reduce the
total number of comparisons. For example, if you want to implement the
"affinity" server group policy, you only need to test a single host (the one
already running the group's instances). If you're matching against host
aggregate metadata, you only need to test against hosts in matching aggregates.
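
A minimal sketch of that ordering idea, again with made-up names (place,
restrictions, predicates) rather than anything from the actual scheduler: some
criteria can be expressed as a set of allowed hosts (the affinity host, the
hosts in matching aggregates), so we intersect those first, smallest set first,
and only then run the per-host checks on whatever is left. The function also
returns the number of per-host checks performed so the saving is visible.

```python
def place(all_hosts, restrictions, predicates):
    """Return (passing_hosts, number_of_per_host_checks).

    restrictions: list of sets of host names; each set is the only hosts a
                  given criterion allows (hypothetical representation).
    predicates:   list of callables taking a host name and returning bool.
    """
    candidates = set(all_hosts)
    # Apply the narrowest restriction first so the candidate set shrinks fast.
    for allowed in sorted(restrictions, key=len):
        candidates &= allowed
        if not candidates:
            break

    checks = 0
    passing = []
    for host in sorted(candidates):
        ok = True
        for pred in predicates:
            checks += 1
            if not pred(host):
                ok = False
                break
        if ok:
            passing.append(host)
    return passing, checks


if __name__ == "__main__":
    hosts = ["node%d" % i for i in range(100)]
    free_ram_mb = {h: 8192 for h in hosts}  # illustrative data only

    def enough_ram(host):
        return free_ram_mb[host] >= 4096

    affinity = {"node7"}  # the single host the server group is on
    aggregate = {"node%d" % i for i in range(10)}  # hosts in the aggregate

    print(place(hosts, [aggregate, affinity], [enough_ram]))
    print(place(hosts, [], [enough_ram])[1])
```

With the affinity restriction in place the RAM check runs once; without any
restriction it runs once per host.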


