[Nova][Scheduler] Reducing race-conditions and re-scheduling during creation of multiple high-resource instances or instances with anti-affinity.

melanie witt melwittt at gmail.com
Tue May 19 23:17:31 UTC 2020


On 5/19/20 16:10, melanie witt wrote:
> On 5/19/20 15:23, Laurent Dumont wrote:
>>  From what we can gather, there are a couple of parameters that can be 
>> tweaked.
>>
>>  1. host_subset_size (Return X number of hosts instead of 1?)
>>  2. randomize_allocation_candidates (Not 100% on this one)
>>  3. shuffle_best_same_weighed_hosts (Return a random one of X number of
>>     computes if they are all equal (instead of the same list for all
>>     scheduling requests))
>>  4. max_attempts (how many times the Scheduler will try to fit the
>>     instance somewhere)
>>
>> We've already raised "max_attempts" to 5 from the default of 3 and 
>> will raise it further. That said, what are the recommendations for the 
>> rest of the settings? We are not exactly concerned with stacking vs 
>> spreading (but that's always nice) of the instances but rather making 
>> sure deployments fail because of real reasons and not just because 
>> Nova/Scheduler keeps stepping on its own toes.
> 
> This is something I've written in the past related to the anti-affinity 
> piece of what you're describing, that might be of help:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1780380#c4
> 
> Option (2) in your list only helps if you have > 1000 hosts in your 
> deployment and you want to make sure resource provider candidates beyond 
> the same first 1000 are regularly made available for scheduling (by 
> randomizing before returning the top 1000 weighted hosts). The placement 
> API will limit the maximum number of returned allocation candidates to 
> 1000 for performance reasons.

And for reference, here is where the limit of 1000 results comes from 
(it is configurable):

https://docs.openstack.org/nova/queens/configuration/config.html#scheduler.max_placement_results
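
To make the effect of the limit concrete, here is a rough sketch in Python. 
It is not placement's actual code; the function name, its parameters and 
the host names are made up for illustration. It only shows why a fixed 
cut-off keeps returning the same first 1000 candidates, while randomizing 
before the cut-off lets the others show up too.

```python
import random


def allocation_candidates(providers, limit=1000, randomize=False, rng=None):
    """Hypothetical stand-in for a limited candidate query.

    Returns at most `limit` providers. Without randomization the same
    leading slice comes back every time; with it, a random sample does.
    """
    rng = rng or random.Random()
    if len(providers) <= limit:
        return list(providers)
    if randomize:
        return rng.sample(providers, limit)
    return list(providers[:limit])


# 1500 made-up hosts, i.e. more than the limit of 1000.
hosts = [f"compute-{i:04d}" for i in range(1500)]

fixed_a = allocation_candidates(hosts)
fixed_b = allocation_candidates(hosts)
random_a = allocation_candidates(hosts, randomize=True, rng=random.Random(1))
random_b = allocation_candidates(hosts, randomize=True, rng=random.Random(2))

# Hosts past the first 1000 never appear without randomization.
never_seen = set(hosts) - set(fixed_a) - set(fixed_b)
print(len(never_seen), "hosts are never offered without randomization")
```

With 1000 hosts or fewer the limit never cuts anything off, which is why 
the option makes no difference for smaller deployments.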

> Option (3) in your list only helps if you have lots of hosts being 
> weighed equally and you need some randomization per exact weight to help 
> prevent collisions. This is usually applicable to requests for certain 
> NUMA topology and you get many hosts weighted equally.
> 
> Hope this helps,
> -melanie
> 
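
For options (1) and (3), a minimal sketch of the idea, assuming a list of 
(host, weight) pairs that has already been filtered and weighed. This is 
not Nova's scheduler code; pick_host and its arguments are hypothetical 
names that only mirror the two options discussed above.

```python
import random


def pick_host(weighed_hosts, host_subset_size=1, shuffle_ties=False, rng=None):
    """Pick one host from a list of (host, weight) tuples.

    host_subset_size: choose at random among the N best hosts.
    shuffle_ties: shuffle the hosts that share the best weight first.
    """
    if not weighed_hosts:
        raise ValueError("no hosts to choose from")
    rng = rng or random.Random()
    # Best weight first; sorted() is stable, so ties keep their input order.
    ordered = sorted(weighed_hosts, key=lambda hw: hw[1], reverse=True)
    if shuffle_ties:
        best = ordered[0][1]
        ties = [hw for hw in ordered if hw[1] == best]
        rng.shuffle(ties)
        ordered = ties + ordered[len(ties):]
    subset = ordered[:max(1, host_subset_size)]
    return rng.choice(subset)[0]


# Made-up example: three equally weighed hosts and one worse host.
hosts = [("c1", 10.0), ("c2", 10.0), ("c3", 10.0), ("c4", 5.0)]
rng = random.Random(42)

default_picks = {pick_host(hosts) for _ in range(50)}
subset_picks = {pick_host(hosts, host_subset_size=3, rng=rng) for _ in range(50)}
tie_picks = {pick_host(hosts, shuffle_ties=True, rng=rng) for _ in range(50)}

print(default_picks)          # every request lands on the same host
print(sorted(subset_picks))   # requests spread over the best three
print(sorted(tie_picks))      # requests spread over the tied hosts only
```

With the defaults, concurrent requests that see the same weighed list all 
pick the same host, which is where the collisions and re-schedules come 
from. Either option spreads those picks out, at the cost of not always 
choosing the single best-weighed host.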



