Open Stack

Fri Jul 15 08:26:52 UTC 2016

Scheduling in the nova-scheduler service consists of two major phases:
A. Cache refresh, in code [1].
B. Filtering and weighing, in code [2].

A couple of previous experiments [3] [4] show that “cache-refresh” is the major bottleneck of the nova scheduler. For example, page 15 of presentation [3] says that “cache-refresh” accounts for 98.5% of the total time of the `_schedule` function [6] when there are 200-1000 nodes and 50+ concurrent requests. The latest experiments [5] in China Mobile’s 1000-node environment lead to the same conclusion, and the share rises to 99.7% when there are 40+ concurrent requests.

Here are some existing solutions for the “cache-refresh” bottleneck:
I. Caching scheduler.
II. Scheduler filters in DB [7].
III. Eventually consistent scheduler host state [8].

I can discuss their merits and drawbacks in a separate thread, but here I want to show the simplest solution, based on my findings during the experiments [5]. I wrapped the expensive function [1] to observe the behavior of cache-refresh under pressure. Interestingly, a single cache-refresh costs only about 0.3 seconds, but when there are concurrent cache-refresh operations, this cost can suddenly increase to 8 seconds. I have even seen it reach 60 seconds for one cache-refresh under higher pressure. See the section below for details.
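To illustrate what "wrapping the expensive function" could look like, here is a minimal sketch of a timing decorator. It is not the instrumentation used in the experiments [5]; the name `timed` and the record layout are hypothetical, and it assumes plain Python threads rather than the concurrency model of a real nova-scheduler deployment.

```python
import functools
import threading
import time


def timed(records):
    """Hypothetical helper: log (start, duration, concurrent_calls) per call."""
    lock = threading.Lock()
    in_flight = [0]  # number of calls currently inside the wrapped function

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with lock:
                in_flight[0] += 1
                concurrent = in_flight[0]  # includes this call
            start = time.monotonic()
            try:
                return func(*args, **kwargs)
            finally:
                duration = time.monotonic() - start
                with lock:
                    in_flight[0] -= 1
                    records.append((start, duration, concurrent))
        return wrapper
    return decorator
```

Applying such a decorator to the cache-refresh call and plotting `duration` against `concurrent` is one way to see whether the cost of a single refresh grows with the number of refreshes in flight.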

This raises a question about the current implementation: do we really need a cache-refresh operation [1] for *every* request? If those concurrent operations are replaced by one database query, the scheduler still gets the latest resource view from the database. The scheduler is even better off, because the expensive cache-refresh operations are minimized and each one is much faster (about 0.3 seconds). I believe this is the simplest optimization for scheduler performance, and it requires no changes in the filter scheduler. Minor improvements inside the host manager are enough.
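The idea of replacing concurrent refreshes with one database query can be sketched as follows. This is only an illustration under stated assumptions, not the actual host manager change: `CoalescedRefresher`, `_Call` and the `fetch` callable are hypothetical names, `fetch` stands in for the database query, and plain threads are assumed.

```python
import threading


class _Call:
    """State shared between the request running the query and its waiters."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class CoalescedRefresher:
    """Hypothetical helper: run at most one refresh at a time and share it."""

    def __init__(self, fetch):
        self._fetch = fetch          # stand-in for the expensive DB query
        self._lock = threading.Lock()
        self._inflight = None        # the _Call in progress, if any

    def refresh(self):
        with self._lock:
            call = self._inflight
            leader = call is None
            if leader:
                call = self._inflight = _Call()
        if leader:
            # Only the first caller actually runs the query.
            try:
                call.result = self._fetch()
            except BaseException as exc:
                call.error = exc
            finally:
                with self._lock:
                    self._inflight = None
                call.done.set()
        else:
            # Later callers wait for the query that is already running.
            call.done.wait()
        if call.error is not None:
            raise call.error
        return call.result
```

One trade-off of this sketch: a request that arrives while a query is already running receives a view read slightly before its own arrival. Whether that is acceptable, or whether such requests should instead share one follow-up query, is a design choice left open here.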

[1] https://github.com/openstack/nova/blob/master/nova/scheduler/filter_scheduler.py#L104 
[2] https://github.com/openstack/nova/blob/master/nova/scheduler/filter_scheduler.py#L112-L123
[3] https://www.openstack.org/assets/presentation-media/7129-Dive-into-nova-scheduler-performance-summit.pdf 
[4] http://lists.openstack.org/pipermail/openstack-dev/2016-June/098202.html 
[5] Please refer to Barcelona summit session ID 15334 later: “A tool to test and tune your OpenStack Cloud? Sharing our 1000 node China Mobile experience.”
[6] https://github.com/openstack/nova/blob/master/nova/scheduler/filter_scheduler.py#L53
[7] https://review.openstack.org/#/c/300178/
[8] https://review.openstack.org/#/c/306844/

****** Here are the findings from the latest experiments [5] ******
https://docs.google.com/document/d/1N_ZENg-jmFabyE0kLMBgIjBGXfL517QftX3DW7RVCzU/edit?usp=sharing 

Figure 1 illustrates the concurrent cache-refresh operations in a nova scheduler service. At most 23 requests are waiting for cache-refresh operations, at time 43s.

Figure 2 illustrates the time cost of every request in the same experiment. It shows that the cost increases as concurrency grows. This demonstrates the vicious circle: a request waits longer for the database when there are more waiting requests.

Figures 3 and 4 illustrate a worse case, in which the cost of a cache-refresh operation reaches 60 seconds because of excessive cache-refresh operations.

-- 
Regards
Yingxin

[openstack-dev] [nova][scheduler] A simple solution for better scheduler performance