[openstack-dev] [nova][scheduler] Anyone relying on the host_subset_size config option?

Jay Pipes jaypipes at gmail.com
Fri May 26 17:28:24 UTC 2017


On 05/26/2017 01:14 PM, Edward Leafe wrote:
> The host_subset_size configuration option was added to the scheduler to reduce races when two requests for similar VMs are processed close together. In that situation the scheduler’s algorithm would select the same host for both requests, leading to a race and a likely build failure for the second request. Randomly choosing from the top N hosts reduces the likelihood of such a race and so leads to fewer failed builds.
> 
> Current changes in the scheduling process now have the scheduler claiming the resources as soon as it selects a host. So in the case above with 2 similar requests close together, the first request will claim successfully, but the second will fail *while still in the scheduler*. Upon failing the claim, the scheduler will simply pick the next host in its weighed list until it finds one that it can claim the resources from. So the host_subset_size configuration option is no longer needed.
> 
> However, we have heard that some operators are relying on this option to help spread instances across their hosts, rather than using the RAM weigher. My question is: will removing this randomness from the scheduling process hurt any operators out there? Or can we safely remove that logic?

Actually, I don't believe this should be removed. The randomness that is 
injected into the placement decision using this configuration setting is 
useful for reducing contention even in the scheduler claim process.
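
To make the mechanism concrete, here is a minimal sketch of choosing at 
random from the top N weighed hosts and falling back to another host 
when a claim fails. It is an illustration only, not Nova's actual code: 
pick_host, select_and_claim and the try_claim callback are hypothetical 
names, and the host list is assumed to be sorted best-first already.

```python
import random


def pick_host(weighed_hosts, host_subset_size=1, rng=random):
    """Pick one host at random from the top N of a best-first list.

    Hypothetical helper: with host_subset_size=1 this always returns
    the best host, so concurrent requests all target the same host.
    """
    if not weighed_hosts:
        raise ValueError("no candidate hosts")
    # Clamp N to the range [1, number of hosts].
    subset_size = max(1, min(host_subset_size, len(weighed_hosts)))
    return rng.choice(weighed_hosts[:subset_size])


def select_and_claim(weighed_hosts, try_claim, host_subset_size=1, rng=random):
    """Try hosts until try_claim(host) succeeds; return None if none does.

    try_claim is an assumed callback that returns True when the
    resources on the host were claimed successfully.
    """
    remaining = list(weighed_hosts)
    while remaining:
        host = pick_host(remaining, host_subset_size, rng)
        if try_claim(host):
            return host
        # Claim failed (e.g. another request got there first): drop it.
        remaining.remove(host)
    return None
```

With host_subset_size=1 every concurrent request attempts its first 
claim on the same host. A larger value spreads those first attempts over 
several hosts, so fewer claims have to be retried.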

When benchmarking claims in the scheduler here:

https://github.com/jaypipes/placement-bench

I found that the use of a "partitioning strategy" resulted in a dramatic 
reduction in lock contention in the claim process. The modulo and random 
partitioning strategies both seemed to work pretty well for reducing 
lock retries.
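
As a rough illustration of the idea (a sketch under my own assumptions, 
not the code from the benchmark repository), the two strategies can be 
thought of as different ways for each scheduler worker to choose which 
candidate host to try first. The function names and the worker_id 
parameter below are hypothetical.

```python
import random


def first_pick(candidates):
    """No partitioning: every worker starts with the same (best) host."""
    return candidates[0]


def modulo_pick(candidates, worker_id):
    """Modulo partitioning: worker k starts with host k mod len(candidates)."""
    return candidates[worker_id % len(candidates)]


def random_pick(candidates, rng=random):
    """Random partitioning: each worker starts with a random host."""
    return rng.choice(candidates)


def collisions(picks):
    """Count picks that landed on a host some other worker also chose."""
    return len(picks) - len(set(picks))


hosts = ["host%d" % i for i in range(8)]
workers = range(8)

no_partition = [first_pick(hosts) for _ in workers]
by_modulo = [modulo_pick(hosts, w) for w in workers]
by_random = [random_pick(hosts) for _ in workers]
```

Here collisions(no_partition) is 7, because all eight workers target the 
same host, while collisions(by_modulo) is 0. The random strategy falls 
somewhere in between and varies from run to run.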

So, in short, I'd say keep it.

Best,
-jay


