[openstack-dev] [placement][nova] Decision time on granular request groups for like resources

Eric Fried openstack at fried.cc
Wed Apr 18 20:52:36 UTC 2018


I can't tell if you're being facetious, but this seems sane, albeit
complex.  It's also extensible as we come up with new and wacky affinity
semantics we want to support.

I can't say I'm sold on requiring `proximity` qparams that cover every
granular group - that seems like a pretty onerous burden to put on the
user right out of the gate.  That said, the idea of not having a default
is quite appealing.  Perhaps as a first pass we can require a single
?proximity={isolate|any} and build on it to support group numbers (etc.)
in the future.
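
To make that first pass concrete, here is a minimal sketch of what requiring a single proximity value could look like on the parsing side. This is not Placement code: the function name `parse_proximity`, the regex, and the error message are all made up for illustration, and it assumes the qparam is only mandatory when numbered groups are present.

```python
import re
from urllib.parse import parse_qs

# Hypothetical first-pass validation; not taken from the Placement code base.
ALLOWED = {"isolate", "any"}
NUMBERED = re.compile(r"^(resources|required)\d+$")


def parse_proximity(query_string):
    """Return 'isolate', 'any', or None when no numbered groups are used."""
    params = parse_qs(query_string)
    has_numbered = any(NUMBERED.match(key) for key in params)
    if not has_numbered:
        return None
    values = params.get("proximity", [])
    # No default: numbered groups must come with exactly one valid value.
    if len(values) != 1 or values[0] not in ALLOWED:
        raise ValueError("exactly one proximity={isolate|any} is required")
    return values[0]
```

Group lists such as `isolate:1,2` are rejected by this sketch on purpose; accepting them would be the later extension.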

One other thing inline below, not related to the immediate subject.

On 04/18/2018 12:40 PM, Jay Pipes wrote:
> On 04/18/2018 11:58 AM, Matt Riedemann wrote:
>> On 4/18/2018 9:06 AM, Jay Pipes wrote:
>>> "By default, should resources/traits submitted in different numbered
>>> request groups be supplied by separate resource providers?"
>>
>> Without knowing all of the hairy use cases, I'm trying to channel my
>> inner sdague and some of the similar types of discussions we've had to
>> changes in the compute API, and a lot of the time we've agreed that we
>> shouldn't assume a default in certain cases.
>>
>> So for this case, if I'm requesting numbered request groups, why
>> doesn't the API just require that I pass a query parameter telling it
>> how I'd like those requests to be handled, either via affinity or
>> anti-affinity?
> So, you're thinking maybe something like this?
> 
> 1) Get me two dedicated CPUs. One of those dedicated CPUs must have AVX2
> capabilities. They must be on different child providers (different NUMA
> cells that are providing those dedicated CPUs).
> 
> GET /allocation_candidates?
> 
>  resources1=PCPU:1&required1=HW_CPU_X86_AVX2
> &resources2=PCPU:1
> &proximity=isolate:1,2
> 
> 2) Get me four dedicated CPUs. Two of those dedicated CPUs must have
> AVX2 capabilities. Two of the dedicated CPUs must have the SSE 4.2
> capability. They may come from the same provider (NUMA cell) or
> different providers.
> 
> GET /allocation_candidates?
> 
>  resources1=PCPU:2&required1=HW_CPU_X86_AVX2
> &resources2=PCPU:2&required2=HW_CPU_X86_SSE42
> &proximity=any:1,2
> 
> 3) Get me 2 dedicated CPUs and 2 SR-IOV VFs. The VFs must be provided by
> separate physical function providers which have different traits marking
> separate physical networks. The dedicated CPUs must come from the same
> provider tree in which the physical function providers reside.
> 
> GET /allocation_candidates?
> 
>  resources1=PCPU:2
> &resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
> &resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
> &proximity=isolate:2,3
> &proximity=same_tree:1,2,3
> 
> 3) Get me 2 dedicated CPUs and 2 SR-IOV VFs. The VFs must be provided by
> separate physical function providers which have different traits marking
> separate physical networks. The dedicated CPUs must come from the same
> provider *subtree* in which the second group of VF resources are sourced.
> 
> GET /allocation_candidates?
> 
>  resources1=PCPU:2
> &resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
> &resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
> &proximity=isolate:2,3
> &proximity=same_subtree:1,3

The 'same_subtree' concept requires a way to identify how far up the
common ancestor can be.  Otherwise, *everything* is in the same subtree.
You could arbitrarily say "one step down from the root", but that's not
very flexible.  Allowing the user to specify a *number* of steps down
from the root is getting closer, but it requires the user to have an
understanding of the provider tree's exact structure, which is not ideal.
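
A toy sketch of the problem, assuming a made-up provider tree stored as a child-to-parent map (the provider names and the `depth` parameter are invented for illustration, not part of any proposed API):

```python
# Toy provider tree as child -> parent; names are invented for illustration.
PARENT = {
    "compute": None,
    "numa0": "compute", "numa1": "compute",
    "pf_a": "numa0", "pf_b": "numa1",
}


def path_from_root(provider):
    """Return the list of providers from the root down to `provider`."""
    path = []
    while provider is not None:
        path.append(provider)
        provider = PARENT[provider]
    return path[::-1]


def same_subtree(a, b, depth=0):
    """True if a and b share the ancestor `depth` steps down from the root."""
    pa, pb = path_from_root(a), path_from_root(b)
    return len(pa) > depth and len(pb) > depth and pa[depth] == pb[depth]
```

With `depth=0` any two providers in the tree match, because they always share the root. With `depth=1`, `pf_a` and `pf_b` no longer match, but the caller had to know that NUMA cells sit exactly one level below the root.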

The idea I've been toying with here is "common ancestor by trait".  For
example, you would tag your NUMA node providers with trait NUMA_ROOT,
and then your request would include:

  ...
  &proximity=common_ancestor_by_trait:NUMA_ROOT:1,3
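
One possible reading of that, sketched on toy data. Everything here is an assumption for illustration: the provider names, the dict-based tree, and the rule that each provider is anchored to its nearest ancestor (or itself) carrying the trait.

```python
# Toy data; provider names and layout are invented for illustration.
PARENT = {"compute": None, "numa0": "compute", "numa1": "compute",
          "cpus0": "numa0", "pf_a": "numa0", "pf_b": "numa1"}
TRAITS = {"numa0": {"NUMA_ROOT"}, "numa1": {"NUMA_ROOT"},
          "pf_a": {"CUSTOM_PHYSNET_A"}, "pf_b": {"CUSTOM_PHYSNET_B"}}


def ancestor_with_trait(provider, trait):
    """Walk up from `provider` (inclusive) to the nearest one with `trait`."""
    while provider is not None:
        if trait in TRAITS.get(provider, set()):
            return provider
        provider = PARENT[provider]
    return None


def common_ancestor_by_trait(trait, providers):
    """True if all providers sit under one and the same `trait` ancestor."""
    anchors = {ancestor_with_trait(p, trait) for p in providers}
    return len(anchors) == 1 and None not in anchors
```

Here `cpus0` and `pf_a` both resolve to `numa0`, so they satisfy the constraint, while `cpus0` and `pf_b` do not. The user only needs to know the trait name, not how deep the NUMA providers are in the tree.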

> 
> 4) Get me 4 SR-IOV VFs. 2 VFs should be sourced from a provider that is
> decorated with the CUSTOM_PHYSNET_A trait. 2 VFs should be sourced from
> a provider that is decorated with the CUSTOM_PHYSNET_B trait. For HA
> purposes, none of the VFs should be sourced from the same provider.
> However, the VFs for each physical network should be within the same
> subtree (NUMA cell) as each other.
> 
> GET /allocation_candidates?
> 
>  resources1=SRIOV_NET_VF:1&required1=CUSTOM_PHYSNET_A
> &resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
> &resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
> &resources4=SRIOV_NET_VF:1&required4=CUSTOM_PHYSNET_B
> &proximity=isolate:1,2,3,4
> &proximity=same_subtree:1,2
> &proximity=same_subtree:3,4
> 
> We can go even deeper if you'd like, since NFV means "never-ending
> feature velocity". Just let me know.
> 
> -jay
