Stephen Finucane <sfinucan@redhat.com> wrote on Tue, Jun 18, 2019 at 5:55 PM:
On Tue, 2019-06-18 at 06:41 +0000, Shewale, Bhagyashri wrote:
As above, ignore 'cpu_shared_set' but issue a warning. Use the value of 'vcpu_pin_set' to report both VCPU and PCPU inventory. Note that 'vcpu_pin_set' is already used to calculate VCPU inventory.
As mentioned in the spec, if the operator sets ``vcpu_pin_set`` in Stein and upgrades to Train, then both VCPU and PCPU inventory should be reported to placement.
On current master (Stein), if the operator sets ``vcpu_pin_set=0-3`` on compute node A and adds node A to a host aggregate, say "agg1", with metadata ``pinned=true``, then it allows creating both pinned and non-pinned instances, which is a known issue:

- Create instance A with flavor extra specs ("aggregate_instance_extra_specs:pinned": "true"); instance A will float on CPUs 0-3.
- Create instance B with flavor extra specs ("aggregate_instance_extra_specs:pinned": "true", "hw:cpu_policy": "dedicated"); instance B will be pinned to one of those CPUs, say 0.

Now, when the operator does the upgrade (Stein to Train), nova-compute will report both VCPU and PCPU inventory. In this case, if cpu_allocation_ratio is 1.0, the total PCPU available will be 4 (``vcpu_pin_set=0-3``) and VCPU will also be 4. This will allow the user to create a maximum of 4 instances with flavor extra spec ``resources:PCPU=1`` plus 4 instances with flavor extra spec ``resources:VCPU=1``.
If the cpu_allocation_ratio is 1.0 then yes, this is correct. However, if it's any greater (and remember, the default is 16.0) then the gap is much smaller, though still broken.
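To make the arithmetic concrete, here is a minimal sketch (a hypothetical helper, not actual nova code) of the inventory a host would report during this transitional window, and how the VCPU side scales with the allocation ratio:

```python
# Hypothetical sketch (not nova code): inventory reported when
# 'vcpu_pin_set' is reused for both VCPU and PCPU during the
# Stein -> Train upgrade window.

def reported_inventory(vcpu_pin_set, cpu_allocation_ratio):
    """Return (VCPU, PCPU) totals for a host.

    PCPU is never overallocated; VCPU scales with the allocation ratio.
    """
    dedicated = len(vcpu_pin_set)
    return int(dedicated * cpu_allocation_ratio), dedicated

# vcpu_pin_set=0-3 with the ratio pinned at 1.0: 4 VCPU + 4 PCPU,
# i.e. up to 8 single-CPU instances instead of 4.
print(reported_inventory({0, 1, 2, 3}, 1.0))   # (4, 4)

# With the default ratio of 16.0, the VCPU side dwarfs the extra
# 4 PCPU, which is why the relative gap is much smaller.
print(reported_inventory({0, 1, 2, 3}, 16.0))  # (64, 4)
```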
With the current master code, it's possible to create only 4 instances, whereas now, by reporting both VCPU and PCPU, the user will be allowed to create a total of 8 instances. That adds another problem on top of the existing known issue. Is this acceptable? It compounds the problems.
I think it is acceptable, yes. As we've said, this is broken behavior and things are just slightly more broken here, though not horribly so. As it stands, if you don't isolate pinned instances from non-pinned instances, you don't get any of the guarantees pinning is supposed to provide. Using the above example, if you booted two pinned and two unpinned instances on the same host, the unpinned instances would float over the pinned instances' cores [*] and impact their performance. If performance is an issue, host aggregates will have been used.
[*] They'll actually float over the entire range of host cores, since instances without a NUMA topology don't respect the 'vcpu_pin_set' value.
Yes, I agree with Stephen: on current master, we don't suggest users mix pinned and non-pinned instances on the same host. If a user wants to mix pinned and non-pinned instances, they need to update their configuration to use 'cpu_dedicated_set' and 'cpu_shared_set'. Having 'vcpu_pin_set' report both VCPU and PCPU inventories is an intermediate state; in that intermediate state, the operator still needs to separate pinned and non-pinned instances onto different hosts.
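For reference, the end-state configuration described above would look something like the following nova.conf fragment (the CPU ranges here are purely illustrative):

```ini
[compute]
# Train and later: explicitly split host CPUs between pinned and
# unpinned guests, replacing 'vcpu_pin_set' for this purpose.
cpu_dedicated_set = 2-7    # reported as PCPU inventory
cpu_shared_set = 0-1       # reported as VCPU inventory
```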
If that is not acceptable, then we can report only PCPU in this case, which would solve two problems:

1. The existing known issue on current master (allowing both pinned and non-pinned instances) on a compute host meant for pinning.
2. The above issue of allowing 8 instances to be created on the host.

But there is one problem with taking this decision: if no instances are running on a compute node where only ``vcpu_pin_set`` is set, how do you find out whether that compute node is configured to create pinned or non-pinned instances? If instances are running, it's possible to detect that based on the host's numa_topology.pinned_cpus.
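The detection heuristic being proposed could be sketched roughly as below (purely illustrative, not nova code; 'pinned_cpus' stands in for the numa_topology.pinned_cpus set mentioned above). The sketch makes the failure mode explicit:

```python
# Illustrative only: guessing whether a host configured with just
# 'vcpu_pin_set' is meant for pinned instances, based on running guests.

def host_is_for_pinning(pinned_cpus):
    """Return True if the host clearly serves pinned instances,
    or None when the question is undecidable."""
    if not pinned_cpus:
        # No running pinned instances: nothing to infer from. An empty
        # host intended for pinning looks identical to a shared host,
        # which is the error-prone gap in this heuristic.
        return None
    return True

print(host_is_for_pinning(set()))    # None - undecidable
print(host_is_for_pinning({0, 2}))   # True
```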
As noted previously, this is too complex and too error prone. Let's just suffer the potential additional impact on performance for those who haven't correctly configured their deployment, knowing that as soon as they get to the U release, where we can require the 'cpu_dedicated_set' and 'cpu_shared_set' options if you want to use pinned instances, things will be fixed.
Stephen