[ops][nova][placement] NUMA topology vs non-NUMA workloads

Alexandre Arents alexandre.arents at corp.ovh.com
Wed Jun 5 14:38:12 UTC 2019

From OVH's point of view:

We do not currently plan to mix NUMA-aware and NUMA-unaware workloads on the same compute node.
So you can go ahead without the "can_split" feature if it helps.


>This message is primarily addressed at operators, and of those,
>operators who are interested in effectively managing and mixing
>workloads that care about NUMA with workloads that do not. There are
>some questions within, after some background to explain the issue.
>
>At the PTG, Nova and Placement developers made a commitment to more
>effectively manage NUMA topologies within Nova and Placement. On the
>placement side this resulted in a spec which proposed several
>features that would enable more expressive queries when requesting
>allocation candidates (places for workloads to go), resulting in
>fewer late scheduling failures.
>
>At first there was one spec that discussed all the features. This
>morning it was split in two because one of the features is proving
>hard to resolve. Those two specs can be found at:
>* https://review.opendev.org/658510 (has all the original discussion)
>* https://review.opendev.org/662191 (the less contentious features split out)
>
>After much discussion, we would prefer to not do the feature
>discussed in 658510. Called 'can_split', it would allow specified
>classes of resource (notably VCPU and memory) to be split across
>multiple NUMA nodes when each node can only contribute a portion of
>the required resources and where those resources are modelled as
>inventory on the NUMA nodes, not the host at large.
>
>While this is a good idea in principle, it turns out (see the spec)
>to cause many issues that require changes throughout the ecosystem,
>for example enforcing pinned CPUs for workloads that would normally
>float. It's possible to make the changes, but it would require
>additional contributors to join the effort, both in terms of writing
>the code and understanding the many issues.
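
To make the 'can_split' idea concrete, here is a minimal sketch using the
example from the questions in this message (two NUMA nodes, 2 VCPU free on
each, a 4 VCPU request). It is a toy model only: place_vcpus and its
arguments are hypothetical names chosen for illustration, and nothing here
is Placement's actual API or algorithm.

```python
from typing import List, Optional


def place_vcpus(free_per_node: List[int], requested: int,
                can_split: bool) -> Optional[List[int]]:
    """Return how many VCPUs each NUMA node would contribute, or None.

    Hypothetical helper for illustration; this is not Placement code.
    """
    if not can_split:
        # Without splitting, one NUMA node must satisfy the whole request.
        for index, free in enumerate(free_per_node):
            if free >= requested:
                allocation = [0] * len(free_per_node)
                allocation[index] = requested
                return allocation
        return None

    # With splitting, each node contributes what it can until satisfied.
    allocation = []
    remaining = requested
    for free in free_per_node:
        take = min(free, remaining)
        allocation.append(take)
        remaining -= take
    return allocation if remaining == 0 else None


host = [2, 2]  # two NUMA nodes, 2 VCPU free on each
print(place_vcpus(host, 4, can_split=False))  # None: no single node fits
print(place_vcpus(host, 4, can_split=True))   # [2, 2]: split across nodes
```

Under this toy model, dropping can_split means the 4 VCPU workload is
rejected on that host even though 4 VCPU are free in total.
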
>
>So the questions:
>* How important, in your cloud, is it to co-locate guests needing a
>   NUMA topology with guests that do not? A review of documentation
>   (upstream and vendor) shows differing levels of recommendation on
>   this, but in many cases the recommendation is to not do it.
>* If your answer to the above is "we must be able to do that": How
>   important is it that your cloud be able to pack workloads as tight
>   as possible? That is: If there are two NUMA nodes and each has 2
>   VCPU free, should a 4 VCPU demanding non-NUMA workload be able to
>   land there? Or would you prefer that not happen?
>* If the answer to the first question is "we can get by without
>   that", is it satisfactory to be able to configure some hosts as NUMA
>   aware and others as not, as described in the "NUMA topology with
>   RPs" spec [1]? In this setup some non-NUMA workloads could end up
>   on a NUMA host (unless otherwise excluded by traits or aggregates),
>   but only when contiguous resources were available.
>
>This latter question articulates the current plan unless responses
>to this message indicate it simply can't work or legions of
>assistance show up. Note that even if we don't do can_split, we'll
>still be enabling significant progress with the other features
>described in the second spec [2].
>
>Thanks for your help in moving us in the right direction.
>
>[1] https://review.opendev.org/552924
>[2] https://review.opendev.org/662191
>
>Chris Dent                       ٩◔̯◔۶           https://anticdent.org/
>freenode: cdent

Alexandre Arents
