Open Stack

Thu Sep 27 18:16:51 UTC 2018

On 09/27/2018 11:15 AM, Eric Fried wrote:
> On 09/27/2018 07:37 AM, Matt Riedemann wrote:
>> On 9/27/2018 5:23 AM, Sylvain Bauza wrote:
>>>
>>>
>>> On Thu, Sep 27, 2018 at 2:46 AM Matt Riedemann <mriedemos at gmail.com> wrote:
>>>
>>>      On 9/26/2018 5:30 PM, Sylvain Bauza wrote:
>>>       > So, during this day, we also discussed NUMA affinity and said
>>>       > that we could possibly use nested resource providers for NUMA
>>>       > cells in Stein, but given we don't yet have a specific Placement
>>>       > API query, NUMA affinity should still use the NUMATopologyFilter.
>>>       > That said, when looking at how to use this filter for vGPUs, it
>>>       > looks to me like I'd need to provide a new version of the
>>>       > NUMACell object and modify the virt.hardware module. Are we also
>>>       > accepting this (given it's a temporary question), or should we
>>>       > wait for the Placement API support?
>>>       >
>>>       > Folks, what are your thoughts?
>>>
>>>      I'm pretty sure we've said several times already that modeling NUMA
>>>      in Placement is not something for which we're holding up the
>>>      extraction.
>>>
>>>
>>> It's not an extraction question. It's just about knowing whether the Nova
>>> folks would accept us modifying some o.vo object and module
>>> temporarily, until the Placement API has a new query parameter.
>>> Whether Placement is extracted or not isn't really the problem; it's
>>> more about the time it will take for this query parameter ("numbered
>>> request groups to be in the same subtree") to be implemented in the
>>> Placement API.
>>> The real problem we have with vGPUs is that without NUMA
>>> affinity, performance would be around 10% lower for vGPUs (if the
>>> pGPU isn't on the same NUMA cell as the pCPU). Not sure large
>>> operators would accept that :(
>>>
>>> -Sylvain
>>
>> I don't know how close we are to having whatever we need for modeling
>> NUMA in the placement API, but I'll go out on a limb and assume we're
>> not close.
> 
> True story. We've been talking about ways to do this since (at least)
> the Queens PTG, but haven't even landed on a decent design, let alone
> talked about getting it specced, prioritized, and implemented. Since
> full NRP support was going to be a prerequisite in any case, and our
> Stein plate is full, Train is the earliest we could reasonably expect to
> get the placement support going, let alone the nova side. So yeah...
> 
>> Given that, if we have to do something within nova for NUMA
>> affinity for vGPUs for the NUMATopologyFilter, then I'd be OK with that
>> since it's short term like you said (although our "short term"
>> workarounds tend to last for many releases). Anyone that cares about
>> NUMA today already has to enable the scheduler filter anyway.
>>
> 
> +1 to this ^

Or, I don't know, maybe don't do anything and deal with the (maybe) 10% 
performance impact from the cross-NUMA main memory <-> CPU hit for 
post-processing of already parallel-processed GPU data.

In other words, like I've mentioned in numerous specs and in person, I 
really don't think this is a major problem and is mostly something we're 
making a big deal about for no real reason.

-jay

[openstack-dev] [nova] Stein PTG summary
