[openstack-dev] [placement][nova] Decision time on granular request groups for like resources

Jay Pipes jaypipes at gmail.com
Wed Apr 18 22:20:14 UTC 2018


On 04/18/2018 04:52 PM, Eric Fried wrote:
> I can't tell if you're being facetious, but this seems sane, albeit
> complex.  It's also extensible as we come up with new and wacky affinity
> semantics we want to support.

I was not being facetious.

> I can't say I'm sold on requiring `proximity` qparams that cover every
> granular group - that seems like a pretty onerous burden to put on the
> user right out of the gate.

I did that because Matt said he wanted no default/implicit behaviour -- 
everything should be explicit.

> That said, the idea of not having a default
> is quite appealing.  Perhaps as a first pass we can require a single
> ?proximity={isolate|any} and build on it to support group numbers (etc.)
> in the future.

Here's my problem.

I have a feeling we're just going to go back and forth on this, as we 
have for weeks now, and not reach any conclusion that is satisfactory to 
everyone. And we'll delay, yet again, getting functionality into this 
release that serves 90% of use cases because we are obsessing over the 
0.01% of use cases that may pop up later.

Best,
-jay

> One other thing inline below, not related to the immediate subject.
> 
> On 04/18/2018 12:40 PM, Jay Pipes wrote:
>> On 04/18/2018 11:58 AM, Matt Riedemann wrote:
>>> On 4/18/2018 9:06 AM, Jay Pipes wrote:
>>>> "By default, should resources/traits submitted in different numbered
>>>> request groups be supplied by separate resource providers?"
>>>
>>> Without knowing all of the hairy use cases, I'm trying to channel my
>>> inner sdague and some of the similar types of discussions we've had about
>>> changes in the compute API, and a lot of the time we've agreed that we
>>> shouldn't assume a default in certain cases.
>>>
>>> So for this case, if I'm requesting numbered request groups, why
>>> doesn't the API just require that I pass a query parameter telling it
>>> how I'd like those requests to be handled, either via affinity or
>>> anti-affinity?
>> So, you're thinking maybe something like this?
>>
>> 1) Get me two dedicated CPUs. One of those dedicated CPUs must have AVX2
>> capabilities. They must be on different child providers (different NUMA
>> cells that are providing those dedicated CPUs).
>>
>> GET /allocation_candidates?
>>
>>   resources1=PCPU:1&required1=HW_CPU_X86_AVX2
>> &resources2=PCPU:1
>> &proximity=isolate:1,2
>>
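(To make request 1 concrete: a rough sketch of how a client could issue it
with Python's requests library. The endpoint, token, and microversion header
are placeholders, and the proximity parameter itself is only the syntax
proposed in this thread -- placement does not accept it today.)

    import requests

    # Placeholder endpoint and credentials -- substitute your deployment's.
    PLACEMENT = 'http://placement.example.com/allocation_candidates'
    HEADERS = {
        'X-Auth-Token': '<keystone token>',
        # Microversion assumed to carry granular request group support.
        'OpenStack-API-Version': 'placement 1.25',
    }

    # Two dedicated CPUs from separate providers; group 1 also needs AVX2.
    params = [
        ('resources1', 'PCPU:1'),
        ('required1', 'HW_CPU_X86_AVX2'),
        ('resources2', 'PCPU:1'),
        ('proximity', 'isolate:1,2'),  # proposed syntax, not yet implemented
    ]
    resp = requests.get(PLACEMENT, headers=HEADERS, params=params)
    resp.raise_for_status()
    candidates = resp.json()['allocation_requests']
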
>> 2) Get me four dedicated CPUs. Two of those dedicated CPUs must have
>> AVX2 capabilities. Two of the dedicated CPUs must have the SSE 4.2
>> capability. They may come from the same provider (NUMA cell) or
>> different providers.
>>
>> GET /allocation_candidates?
>>
>>   resources1=PCPU:2&required1=HW_CPU_X86_AVX2
>> &resources2=PCPU:2&required2=HW_CPU_X86_SSE42
>> &proximity=any:1,2
>>
>> 3) Get me 2 dedicated CPUs and 2 SR-IOV VFs. The VFs must be provided by
>> separate physical function providers which have different traits marking
>> separate physical networks. The dedicated CPUs must come from the same
>> provider tree in which the physical function providers reside.
>>
>> GET /allocation_candidates?
>>
>>   resources1=PCPU:2
>> &resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
>> &resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
>> &proximity=isolate:2,3
>> &proximity=same_tree:1,2,3
>>
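(If it helps to see the grammar spelled out, here is a toy parser for the
proposed proximity values -- nothing like this exists in placement, it is
just a sketch of the "policy:group,group,..." form and of the "every
numbered group must be covered" rule discussed above.)

    VALID_POLICIES = {'isolate', 'any', 'same_tree', 'same_subtree'}

    def parse_proximity(values, group_ids):
        """values: raw 'policy:1,2,...' strings; group_ids: all groups."""
        constraints = []
        covered = set()
        for value in values:
            policy, _, groups = value.partition(':')
            if policy not in VALID_POLICIES:
                raise ValueError('unknown proximity policy: %s' % policy)
            members = {int(g) for g in groups.split(',') if g}
            if not members <= set(group_ids):
                raise ValueError('proximity names unknown group(s): %s' % value)
            constraints.append((policy, frozenset(members)))
            covered |= members
        if covered != set(group_ids):
            raise ValueError('every numbered group must appear in a '
                             'proximity param')
        return constraints

    # parse_proximity(['isolate:2,3', 'same_tree:1,2,3'], [1, 2, 3])
    # -> [('isolate', frozenset({2, 3})), ('same_tree', frozenset({1, 2, 3}))]
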
>> 4) Get me 2 dedicated CPUs and 2 SR-IOV VFs. The VFs must be provided by
>> separate physical function providers which have different traits marking
>> separate physical networks. The dedicated CPUs must come from the same
>> provider *subtree* from which the second group of VF resources is sourced.
>>
>> GET /allocation_candidates?
>>
>>   resources1=PCPU:2
>> &resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
>> &resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
>> &proximity=isolate:2,3
>> &proximity=same_subtree:1,3
> 
> The 'same_subtree' concept requires a way to identify how far up the
> common ancestor can be.  Otherwise, *everything* is in the same subtree.
> You could arbitrarily say "one step down from the root", but that's not
> very flexible.  Allowing the user to specify a *number* of steps down
> from the root is getting closer, but it requires the user to have an
> understanding of the provider tree's exact structure, which is not ideal.
> 
> The idea I've been toying with here is "common ancestor by trait".  For
> example, you would tag your NUMA node providers with trait NUMA_ROOT,
> and then your request would include:
> 
>    ...
>    &proximity=common_ancestor_by_trait:NUMA_ROOT:1,3
> 
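(A sketch of how that check might be evaluated over a provider tree,
assuming an in-memory provider object that knows its parent and its traits
-- the helper names here are made up for illustration.)

    def nearest_ancestor_with_trait(provider, trait):
        """Walk up from provider to the closest ancestor with the trait."""
        node = provider
        while node is not None:
            if trait in node.traits:
                return node
            node = node.parent
        return None

    def satisfies_common_ancestor(chosen, trait, group_ids):
        """chosen: mapping of group id -> provider selected for that group."""
        anchors = {nearest_ancestor_with_trait(chosen[g], trait)
                   for g in group_ids}
        # All named groups must resolve to one and the same tagged ancestor.
        return len(anchors) == 1 and None not in anchors

    # satisfies_common_ancestor(chosen, 'NUMA_ROOT', [1, 3]) would implement
    # proximity=common_ancestor_by_trait:NUMA_ROOT:1,3
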
>>
>> 5) Get me 4 SR-IOV VFs. 2 VFs should be sourced from a provider that is
>> decorated with the CUSTOM_PHYSNET_A trait. 2 VFs should be sourced from
>> a provider that is decorated with the CUSTOM_PHYSNET_B trait. For HA
>> purposes, none of the VFs should be sourced from the same provider.
>> However, the VFs for each physical network should be within the same
>> subtree (NUMA cell) as each other.
>>
>> GET /allocation_candidates?
>>
>>   resources1=SRIOV_NET_VF:1&required1=CUSTOM_PHYSNET_A
>> &resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
>> &resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
>> &resources4=SRIOV_NET_VF:1&required4=CUSTOM_PHYSNET_B
>> &proximity=isolate:1,2,3,4
>> &proximity=same_subtree:1,2
>> &proximity=same_subtree:3,4
>>
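(Last sketch, in the same hypothetical vein: how one candidate for request 5
might be checked against those policies, again assuming an in-memory provider
tree where each provider knows its parent. Note the arbitrary "one step below
the root" definition of a subtree -- exactly the ambiguity Eric points out
above.)

    def subtree_root(provider, depth=1):
        """Return the ancestor 'depth' levels below the tree root."""
        chain = []
        node = provider
        while node is not None:
            chain.append(node)
            node = node.parent
        # chain[-1] is the root; chain[-1 - depth] is 'depth' levels below it.
        return chain[-1 - depth] if len(chain) > depth else chain[0]

    def satisfies_request_5(chosen):
        """chosen: mapping of group id (1-4) -> provider selected for it."""
        providers = [chosen[g] for g in (1, 2, 3, 4)]
        if len(set(providers)) != len(providers):            # isolate:1,2,3,4
            return False
        if subtree_root(chosen[1]) is not subtree_root(chosen[2]):
            return False                                     # same_subtree:1,2
        if subtree_root(chosen[3]) is not subtree_root(chosen[4]):
            return False                                     # same_subtree:3,4
        return True
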
>> We can go even deeper if you'd like, since NFV means "never-ending
>> feature velocity". Just let me know.
>>
>> -jay
>>