Hey everyone,

We are seeing a fairly consistent issue with the Nova scheduler where some instance creations hit the scheduler's "max_attempts" limit.

Env : Red Hat Queens
Computes : All the same hardware and specs (even weight throughout)
Nova : Three nova-schedulers

This can be due to two different factors (from what we've seen) :
From what we can gather, there are a few parameters that can be tweaked:
  1. host_subset_size (pick from the best X hosts instead of only the single best one?)
  2. randomize_allocation_candidates (not 100% sure on this one)
  3. shuffle_best_same_weighed_hosts (return a random one of the X computes when they all have equal weight, instead of the same list order for every scheduling request)
  4. max_attempts (how many times the scheduler will try to fit the instance somewhere)
We've already raised "max_attempts" from the default of 3 to 5 and will raise it further. That said, what are the recommendations for the rest of the settings? We are not especially concerned with stacking vs. spreading the instances (though that's always nice); we mainly want to make sure deployments fail for real reasons and not just because the Nova scheduler keeps stepping on its own toes.
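
For reference, this is roughly where the four options sit in nova.conf on Queens, as far as we understand it. Only the max_attempts value reflects what we run today; the other values are placeholders to illustrate the question, not settings we have tested or are recommending.

```ini
[scheduler]
# Raised from the default of 3 (this is our current value).
max_attempts = 5

[filter_scheduler]
# Default is 1. The value 3 here is only an illustrative placeholder.
host_subset_size = 3
# Default is false. Shown enabled purely as an example.
shuffle_best_same_weighed_hosts = true

[placement]
# Default is false. Shown enabled purely as an example.
randomize_allocation_candidates = true
```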

Thanks!