[nova] Spec: Standardize CPU resource tracking

Stephen Finucane sfinucan at redhat.com
Mon Jun 17 10:19:57 UTC 2019

On Fri, 2019-06-14 at 08:37 +0000, Shewale, Bhagyashri wrote:
> >> that is incorrect. Both A and B will be returned. The spec states
> >> that for host A we report an inventory of 4 VCPUs and an inventory
> >> of 4 PCPUs, and host B will have 1 inventory of 4 PCPUs, so both
> >> hosts will be returned assuming $<no. of cpus> <= 4
> This means that if ``vcpu_pin_set`` was set in the previous release,
> both VCPU and PCPU are reported as inventory (in Train), but this
> seems contradictory. For example:
> On Stein, 
> Configuration on compute node A:
> vcpu_pin_set=0-3 (This will report 4 VCPUs inventory in placement
> database)
> On Train:
> vcpu_pin_set=0-3
> The inventory will be reported as 4 VCPUs and 4 PCPUs in the
> placement db
> Now say user wants to create instances as below:
> Flavor having extra specs (resources:PCPU=1), instance A
> Flavor having extra specs (resources:VCPU=1), instance B
> For both instance requests, placement will return compute Node A.
> Instance A: will be pinned to, say, CPU 0
> Instance B: will float on CPUs 0-3

This is not a serious issue. It is very similar to what happens today
if you don't use host aggregates to isolate NUMA-based instances from
non-NUMA-based instances. If you can assume that operators are using
host aggregates to separate pinned and unpinned instances, then the
VCPU inventory of a host in the 'pinned' aggregate will never be
consumed, and vice versa.
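
To make that concrete, here is a toy model of the matching described
above. It is only a sketch and not nova or placement code: the
function, the aggregate sets and host C are all hypothetical, and
aggregates are reduced to a plain set of allowed host names.

```python
# Toy model only: this is not nova or placement code, and every name
# here (including host C) is hypothetical.

def candidate_hosts(inventories, resource_class, amount, allowed=None):
    """Return hosts that have at least `amount` of `resource_class`.

    `allowed` mimics an aggregate restriction: if given, only hosts in
    that set are considered.
    """
    return sorted(
        host
        for host, inventory in inventories.items()
        if inventory.get(resource_class, 0) >= amount
        and (allowed is None or host in allowed)
    )


inventories = {
    "A": {"VCPU": 4, "PCPU": 4},  # upgraded host with vcpu_pin_set=0-3
    "B": {"PCPU": 4},             # host reporting only PCPUs
    "C": {"VCPU": 4},             # hypothetical host for unpinned instances
}

# Without aggregates, host A is a candidate for both kinds of request.
pinned_hosts = candidate_hosts(inventories, "PCPU", 1)
floating_hosts = candidate_hosts(inventories, "VCPU", 1)

# With aggregates, unpinned requests never reach host A, so its VCPU
# inventory is never consumed.
pinned_aggregate = {"A", "B"}
unpinned_aggregate = {"C"}
isolated_pinned = candidate_hosts(inventories, "PCPU", 1, pinned_aggregate)
isolated_floating = candidate_hosts(inventories, "VCPU", 1, unpinned_aggregate)
```

In this model host A only shows up for the VCPU request when no
aggregate restriction is applied, which is the same situation operators
are already in today.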

> To resolve above issue, I think it’s possible to detect whether the
> compute node was configured to be used for pinned instances if
> ``NumaTopology`` ``pinned_cpus`` attribute is not empty. In that
> case, vcpu_pin_set will be reported as PCPU otherwise VCPU.

This only works if the host already has instances on it. If you have a
deployment with 100 hosts and 82 of them have instances on them at the
time of upgrade, then 82 will start reporting PCPU inventory and 18
will continue reporting just VCPU inventory. We thought long and hard
about this and there is no good heuristic we can use to separate hosts
that should report PCPUs from those that should report VCPUs. That's
why we said we'll report both and hope that host aggregates are
configured correctly. If host aggregates aren't configured, then
things are no more broken than before but at least the operator will
now get warnings (about missing 'cpu_dedicated_set' options).
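
A rough sketch of why the 'pinned_cpus' heuristic splits the deployment
follows. Again this is a toy model, not the nova implementation: both
functions are hypothetical, a host is reduced to the set of CPUs that
currently have instances pinned to them, and every host is assumed to
have vcpu_pin_set=0-3.

```python
# Toy model only: hypothetical helpers, not the nova implementation.

def inventory_by_heuristic(pin_set_size, pinned_cpus):
    """Proposed heuristic: report PCPU only if something is pinned."""
    if pinned_cpus:
        return {"PCPU": pin_set_size}
    return {"VCPU": pin_set_size}


def inventory_report_both(pin_set_size):
    """Approach from the spec: report both classes for vcpu_pin_set."""
    return {"VCPU": pin_set_size, "PCPU": pin_set_size}


# 100 hosts intended for pinned instances; 82 have an instance pinned
# to CPU 0 at upgrade time and 18 happen to be empty.
hosts = [{0} if i < 82 else set() for i in range(100)]

by_heuristic = [inventory_by_heuristic(4, pinned) for pinned in hosts]
pcpu_hosts = sum(1 for inv in by_heuristic if "PCPU" in inv)
vcpu_only_hosts = sum(1 for inv in by_heuristic if "PCPU" not in inv)

# Reporting both gives every host the same inventory regardless of
# whether it had instances at upgrade time.
report_both = [inventory_report_both(4) for _ in hosts]
uniform = all(inv == {"VCPU": 4, "PCPU": 4} for inv in report_both)
```

With the heuristic, the 18 empty hosts end up reporting only VCPUs even
though they were meant for pinned instances, whereas reporting both
treats all 100 hosts the same.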

As before, please push some of this code up so we can start reviewing
it.


