[openstack-dev] [nova] A prototype implementation towards the "shared state scheduler"

John Garbutt john at johngarbutt.com
Mon Feb 22 10:44:32 UTC 2016

On 21 February 2016 at 13:51, Cheng, Yingxin <yingxin.cheng at intel.com> wrote:
> On 19 February 2016 at 5:58, John Garbutt wrote:
>> On 17 February 2016 at 17:52, Clint Byrum <clint at fewbar.com> wrote:
>> > Excerpts from Cheng, Yingxin's message of 2016-02-14 21:21:28 -0800:
>> Long term, I see a world where there are multiple schedulers Nova is able to use,
>> depending on the deployment scenario.
> Technically, what I've implemented is a new type of scheduler host manager
> `shared_state_manager.SharedHostManager`[1] with the ability to synchronize host
> states directly from resource trackers.

That's fine. You just get to re-use more code.

Maybe I should say multiple scheduling strategies, or something like that.

>> So a big question for me is, does the new scheduler interface work if you look at
>> slotting in your prototype scheduler?
>> Specifically I am thinking about this interface:
>> https://github.com/openstack/nova/blob/master/nova/scheduler/client/__init__.py

I am still curious if this interface is OK for your needs?

Making this work across both types of scheduler might be tricky, but I
think it is worthwhile.

>> > This mostly agrees with recent tests I've been doing simulating 1000
>> > compute nodes with the fake virt driver.
>> Overall this agrees with what I saw in production before moving us to the
>> caching scheduler driver.
>> I would love a nova functional test that does that test. It will help us compare
>> these different schedulers and find the strengths and weaknesses.
> I'm also working on implementing the functional tests of nova scheduler, there
> is a patch showing my latest progress: https://review.openstack.org/#/c/281825/
> IMO scheduler functional tests are not good at testing real performance of
> different schedulers, because all of the services are running as green threads
> instead of real processes. I think the better way to analyze the real performance
> and the strengths and weaknesses is to start services in different processes with
> fake virt driver (i.e. Clint Byrum's work) or Jay Pipes's work in emulating different
> designs.

Having an option to run multiple processes seems OK, if it's needed.
Although starting with a greenlet version that works in the gate seems best.

Let's try a few things, and see what predicts the results in real environments.

>> I am really interested how your prototype and the caching scheduler compare?
>> It looks like single node scheduler will perform in a very similar way, but multiple
>> schedulers are less likely to race each other, although there are quite a few
>> races?
> I think the major weakness of caching scheduler comes from its host state update
> model, i.e. updating host states from db every `CONF.scheduler_driver_task_period`
> seconds.

The trade-off is that consecutive scheduler decisions don't race each
other, at all. Say you have a burst of 1000 instance builds and you
want to avoid build failures (but accept suboptimal placement, and
you are using fill first): that's a very good trade-off.

Consider a burst of 1000 deletes: it may take you 60 seconds to notice
they are all deleted and you have lots more free space, but that
doesn't cause build failures like excessive races for the same
resources will, at least under the usual conditions where you are not
yet totally full (i.e. non-HPC use cases).
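
To make that trade-off concrete, here is a toy sketch. It is not nova
code: the class, the dict standing in for the database, and the "slots"
unit are all made up for illustration. The only idea it borrows is that
host states are reloaded periodically, while every decision in between
is deducted from the cached copy.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    name: str
    free_slots: int


class CachingSchedulerSketch:
    """Toy model of a scheduler that works from a cached host list.

    Hypothetical names throughout; `db` is just a dict mapping a host
    name to its free slots, standing in for the real database.
    """

    def __init__(self, db):
        self.db = db
        self.cache = []
        self.refresh()

    def refresh(self):
        # Stands in for the periodic task that reloads host states.
        self.cache = [HostState(n, f) for n, f in sorted(self.db.items())]

    def select_host(self):
        # Fill first: choose the host with the least free capacity
        # that can still take one more instance.
        candidates = [h for h in self.cache if h.free_slots > 0]
        if not candidates:
            return None
        host = min(candidates, key=lambda h: h.free_slots)
        # Deduct from the cached copy immediately, so the next decision
        # made by this scheduler sees the space as already used.
        host.free_slots -= 1
        return host.name
```

With `db = {"a": 2, "b": 3}`, a burst of requests fills "a", then "b",
and then gets `None` rather than overcommitting a host. Space freed in
`db` by deletes stays invisible until the next `refresh()`, which is
the stale but safe behaviour described above.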

I was shocked how well the caching_scheduler works in practice. I
assumed it would be terrible, but when we tried it, it worked well.
It's a million miles from perfect, but handy for many deployments.


If you need a 1000 node test cluster to play with, it's worth applying
to use this one:
I am happy to recommend these efforts get some time with that hardware.
