[openstack-dev] [nova][scheduler] Anyone relying on the host_subset_size config option?
openstack at nemebean.com
Fri May 26 20:46:51 UTC 2017
On 05/26/2017 12:17 PM, Edward Leafe wrote:
> [resending to include the operators list]
> The host_subset_size configuration option was added to the scheduler to help eliminate race conditions when two requests for a similar VM would be processed close together, since the scheduler’s algorithm would select the same host in both cases, leading to a race and a likely failure to build for the second request. By randomly choosing from the top N hosts, the likelihood of a race would be reduced, leading to fewer failed builds.
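To make the quoted behavior concrete, here is a minimal sketch of random
top-N selection over a weighed host list (a hypothetical helper for
illustration only, not the actual nova scheduler code; the function and
parameter names are assumptions):

```python
import random

def select_host(weighed_hosts, host_subset_size=1):
    """Pick a host at random from the top-N entries of a weighed list.

    weighed_hosts is assumed to be sorted best-first. With
    host_subset_size=1 this degenerates to always picking the top host,
    which is what causes two near-simultaneous similar requests to race
    on the same host; a larger N spreads them out.
    """
    subset = weighed_hosts[:host_subset_size]
    return random.choice(subset)
```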
> Current changes in the scheduling process now have the scheduler claiming the resources as soon as it selects a host. So in the case above with 2 similar requests close together, the first request will claim successfully, but the second will fail *while still in the scheduler*. Upon failing the claim, the scheduler will simply pick the next host in its weighed list until it finds one that it can claim the resources from. So the host_subset_size configuration option is no longer needed.
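The claim-and-fall-through behavior described above could be sketched like
this (again purely illustrative; the `claim` callable and its semantics are
assumptions, not nova's actual interface):

```python
def schedule(weighed_hosts, claim):
    """Walk the weighed list best-first, attempting to claim resources on
    each host in turn; return the first host whose claim succeeds.

    If a near-simultaneous request already claimed the top host, its claim
    here fails and we simply move on to the next-best host, so the race no
    longer needs randomization to avoid a failed build.
    """
    for host in weighed_hosts:
        if claim(host):
            return host
    return None  # no host in the list could satisfy the claim
```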
> However, we have heard that some operators are relying on this option to help spread instances across their hosts, rather than using the RAM weigher. My question is: will removing this randomness from the scheduling process hurt any operators out there? Or can we safely remove that logic?
We used host_subset_size to schedule randomly in one of the TripleO CI
clouds. Essentially we had a heterogeneous set of hardware where the
numerically larger (more RAM, more disk, equal CPU cores) systems were
significantly slower. This caused them to be preferred by the scheduler
with a normal filter configuration, which is obviously not what we
wanted. I'm not sure if there's a smarter way to handle it, but setting
host_subset_size to the number of compute nodes and disabling basically
all of the weighers let us distribute load evenly, so at least the slow
nodes weren't preferred.
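For anyone curious, a configuration along those lines might look roughly
like this (a sketch only — the section name, option names, and whether an
empty weigher list is accepted vary by release, so treat the specifics as
assumptions to verify against your nova version):

```ini
[filter_scheduler]
# Set the subset to (roughly) the number of compute nodes so the
# scheduler effectively picks at random from all passing hosts.
host_subset_size = 50
# Disable the weighers so no host is systematically preferred.
weight_classes =
```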
That said, we're migrating away from that frankencloud so I certainly
wouldn't block any scheduler improvements on it. I'm mostly chiming in
to describe a possible use case. And please feel free to point out if
there's a better way to do this. :-)