[Nova][Scheduler] Reducing race conditions and rescheduling during creation of multiple high-resource instances or instances with anti-affinity.

Laurent Dumont laurentfdumont at gmail.com
Tue May 19 22:23:50 UTC 2020


Hey everyone,

We are seeing a pretty consistent issue with Nova/Scheduler where some
instance creations are hitting the scheduler's "max_attempts" limit.

Env : Red Hat Queens
Computes : All the same hardware and specs (even weight throughout)
Nova : Three nova-schedulers

From what we've seen, this can be due to two different factors:

   - Anti-affinity rules are getting triggered during the creation (two
   claims are made within a few milliseconds on the same compute), which
   counts as a retry. We've seen this when spawning 40+ VMs in a single
   server group with maybe 50-55 computes, and even with as few as 14
   instances on 20ish computes.
   - We've seen another case where MEMORY_MB becomes an issue. We are
   spinning up new instances in the same host aggregate where VMs are
   already running. Only one VM can run per compute, but there are no
   anti-affinity groups to enforce that between the two deployments; the
   resource requirements alone prevent anything else from being spun up
   on those computes.
   - The logs look like the following:
      - Unable to submit allocation for instance
      659ef90e-33b8-42a9-9c8e-fac87278240d (409 {"errors": [{"status": 409,
      "request_id": "req-429c2734-2f2d-4d2d-82d1-fa4ebe12c991", "detail":
      "There was a conflict when trying to complete your request.\n\n Unable
      to allocate inventory: Unable to create allocation for 'MEMORY_MB' on
      resource provider '35b78f3b-8e59-4f2f-8cad-eaf116b7c1c7'. The requested
      amount would exceed the capacity. ", "title": "Conflict"}]})
      - Setting instance to ERROR state.: MaxRetriesExceeded: Exceeded
      maximum number of retries. Exhausted all hosts available for retrying
      build failures for instance f6d06cca-e9b5-4199-8220-e3ff2e5c2a41.
   - I do believe we are hitting this issue as well:
   https://bugs.launchpad.net/nova/+bug/1837955
      - In all the cases where the stack creation failed, one instance was
      left in the BUILD state for 120 minutes and then finally failed.
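
To make the MEMORY_MB conflict above concrete, here is a minimal Python
sketch of two claims landing on the same compute. It is a toy model, not
Nova or Placement code: the Provider class, its claim method, the host
names and the memory figures are all made up for illustration.

```python
class Conflict(Exception):
    """Stands in for the HTTP 409 returned when an allocation does not fit."""


class Provider:
    """Hypothetical stand-in for one compute node's MEMORY_MB inventory."""

    def __init__(self, total_mb):
        self.total_mb = total_mb
        self.used_mb = 0

    def claim(self, amount_mb):
        # Reject the claim if it would push usage past the total capacity.
        if self.used_mb + amount_mb > self.total_mb:
            raise Conflict("requested amount would exceed the capacity")
        self.used_mb += amount_mb


def apply_picks(picks, providers, amount_mb):
    """Apply each pick in order and return (placed, rejected)."""
    placed, rejected = 0, 0
    for host in picks:
        try:
            providers[host].claim(amount_mb)
            placed += 1
        except Conflict:
            rejected += 1
    return placed, rejected


# Assumed numbers: each compute has room for exactly one 800 MB instance.
providers = {"compute-1": Provider(1000), "compute-2": Provider(1000)}

# Two schedulers both saw compute-1 as empty and picked it.
result = apply_picks(["compute-1", "compute-1"], providers, 800)
```

The second claim is rejected in the same way as the 409 in the logs, even
though compute-2 was still free.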

From what we can gather, there are a few parameters that can be
tweaked:

   1. host_subset_size (return X number of hosts instead of 1?)
   2. randomize_allocation_candidates (not 100% sure on this one)
   3. shuffle_best_same_weighed_hosts (return a random one of X number of
   computes if they are all equal, instead of the same list for all
   scheduling requests)
   4. max_attempts (how many times the scheduler will try to fit the
   instance somewhere)
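
The effect of host_subset_size and shuffle_best_same_weighed_hosts on
equally weighted computes can be sketched with a small Python simulation.
This is only an illustration under strong assumptions, not how Nova
implements scheduling: every request sees the same stale host list, every
compute fits exactly one instance, and the function name count_conflicts is
hypothetical. The 14 requests and 20 computes are borrowed from the smaller
case mentioned earlier.

```python
import random


def count_conflicts(num_hosts, num_requests, host_subset_size, rng):
    """Count requests whose chosen host was already taken by an earlier one.

    Assumptions: all hosts have the same weight, each host fits a single
    instance, and every request picks from the same stale, identically
    ordered host list (the worst case for parallel schedulers).
    """
    hosts = list(range(num_hosts))  # same ordering for every request
    taken = set()
    conflicts = 0
    for _ in range(num_requests):
        # Pick randomly among the first host_subset_size hosts.
        candidates = hosts[:host_subset_size]
        choice = rng.choice(candidates)
        if choice in taken:
            conflicts += 1
        else:
            taken.add(choice)
    return conflicts


rng = random.Random(0)  # fixed seed so the run is repeatable

# Subset of 1: every request is handed the same "best" host.
always_first = count_conflicts(20, 14, 1, rng)

# Subset covering all 20 equally weighted hosts: picks are spread out.
spread = count_conflicts(20, 14, 20, rng)
```

With a subset of 1, every request after the first lands on the same
compute. Picking randomly among all equally weighted computes typically
leaves fewer collisions, though not zero, so retries are still needed.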

We've already raised "max_attempts" to 5 from the default of 3 and will
raise it further. That said, what are the recommendations for the rest of
the settings? We are not particularly concerned with stacking vs spreading
of the instances (though that's always nice) but rather with making sure
deployments fail for real reasons and not just because Nova/Scheduler keeps
stepping on its own toes.

Thanks!