[placement][nova][ptg] resource provider affinity
jaypipes at gmail.com
Sat Apr 27 15:52:29 UTC 2019
On 04/26/2019 08:49 PM, Alex Xu wrote:
> Nadathur, Sundar <sundar.nadathur at intel.com> wrote:
>> Anyways, for Cyborg, it seems to me that there is a fairly
>> straightforward scheme to address NUMA affinity: annotate the
>> device’s nested RP with a trait indicating which NUMA node it
>> belongs to (e.g. CUSTOM_NUMA_NODE_0), and use that to guide
>> scheduling. This should be a valid use of traits because it
>> expresses a property of the resource provider and is used for
>> scheduling (only).
> I don't like the way of using trait to mark out the NUMA node.
Me neither. Traits are capabilities, not indicators of the relationship
between one provider and another.
The structure of hierarchical resource providers is what provides
topology information -- i.e. about how providers are related to each
other within a tree organization, and this is what is appropriate for
encoding NUMA topology information into placement.
The request should never ask for "NUMA Node 0". The reason is that the
request shouldn't require that the user understand where the resources
are physically located on the host.
It shouldn't matter *which* NUMA node a particular device that is
providing some resources is affined to. The only thing that matters to a
*request* is that the user is able to describe the nature of the affinity.
I propose using a "group_policy=same_tree:$GROUP_A:$GROUP_B" query
parameter for enabling users to describe the affinity constraints for
various resources involved in different RequestGroups in the request spec.
group_policy=same_tree:$A:$B would mean "ensure that the providers that
match the constraints of request group $B are in the same inclusive tree
that matched for request group $A"
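To make the same_tree semantics concrete, here is a minimal Python sketch
of the constraint check. All names and the parent-map structure are
illustrative, not placement's actual data model: the idea is simply that
the two matched providers must sit on a single root-to-leaf path in the
provider tree (one is an ancestor of the other).

```python
def lineage(rp, parent):
    """Return rp plus all of its ancestors up to the tree root."""
    seen = []
    while rp is not None:
        seen.append(rp)
        rp = parent.get(rp)
    return seen

def same_tree(group_a_rp, group_b_rp, parent):
    """True when one provider is an ancestor of the other -- e.g. a
    NUMA node provider and a device provider nested beneath it."""
    return (group_a_rp in lineage(group_b_rp, parent)
            or group_b_rp in lineage(group_a_rp, parent))

# Toy tree: compute node -> NUMA nodes -> FPGA under NUMA node 0.
parent = {"numa0": "cn", "numa1": "cn", "fpga0": "numa0"}
print(same_tree("numa0", "fpga0", parent))  # True: fpga0 is under numa0
print(same_tree("numa1", "fpga0", parent))  # False: different subtrees
```

A real implementation would evaluate this over every candidate
provider mapping, but the per-candidate check is just this lineage test.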
So, let's say you have a flavor that will consume:
2 dedicated host CPU processors
4G of main memory
1 context/handle for an accelerator running a crypto algorithm
Further, you want to ensure that the provider tree that is providing
those dedicated CPUs and RAM will also provide the accelerator context
-- in other words, you are requesting a low level of latency between the
memory and the accelerator device itself.
The above request to GET /a_c would look like this:
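The query string itself was elided from the archive; a rough sketch,
using the numbered (granular) request-group syntax and with the resource
class and trait names guessed from the prose below, might be:

```text
GET /allocation_candidates?
    resources1=PCPU:2,MEMORY_MB:4096
    &resources2=CUSTOM_ACCELERATOR_CONTEXT:1
    &required2=CUSTOM_BITSTREAM_4AC1
    &group_policy=same_tree:1:2
```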
which would mean, in English, "get me an accelerator context from an
FPGA that has been flashed with the 4AC1 crypto bitstream and is affined
to the NUMA node that is providing 4G of main memory and 2 dedicated
host CPUs".