[nova] Spec: Standardize CPU resource tracking
Alex Xu
soulxu at gmail.com
Tue Jun 18 13:33:00 UTC 2019
Stephen Finucane <sfinucan at redhat.com> wrote on Tue, Jun 18, 2019 at 5:55 PM:
> On Tue, 2019-06-18 at 06:41 +0000, Shewale, Bhagyashri wrote:
> > > As above, ignore 'cpu_shared_set' but issue a warning. Use the value of
> > > 'vcpu_pin_set' to report both VCPU and PCPU inventory. Note that
> > > 'vcpu_pin_set' is already used to calculate VCPU inventory.
> >
> > As mentioned in the spec, if the operator sets ``vcpu_pin_set`` in
> > Stein and upgrades to Train, then both VCPU and PCPU inventory
> > should be reported in placement.
> >
> > On current master (Stein), if the operator sets ``vcpu_pin_set=0-3``
> > on compute node A and adds node A to a host aggregate, say "agg1",
> > with metadata ``pinned=true``, then it allows creating both pinned
> > and non-pinned instances, which is a known big issue.
> > Create instance A with flavor extra specs
> > ("aggregate_instance_extra_specs:pinned": "true") and instance A
> > will float on CPUs 0-3.
> > Create instance B with flavor extra specs
> > ("aggregate_instance_extra_specs:pinned": "true", "hw:cpu_policy":
> > "dedicated") and instance B will be pinned to one of those CPUs,
> > say 0.
> > Now, when the operator does the upgrade (Stein to Train),
> > nova-compute will report both VCPU and PCPU inventory. In this case,
> > if cpu_allocation_ratio is 1, then the total PCPU available will be
> > 4 (vcpu_pin_set=0-3) and VCPU will also be 4. This will allow the
> > user to create a maximum of 4 instances with flavor extra spec
> > ``resources:PCPU=1`` and 4 instances with flavor extra spec
> > ``resources:VCPU=1``.
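
To make the arithmetic concrete (this is just my sketch of the
resulting inventory, not actual placement output), with
vcpu_pin_set=0-3 and cpu_allocation_ratio=1.0 the compute node would
expose roughly:

  VCPU: total=4, allocation_ratio=1.0  -> up to 4 x resources:VCPU=1
  PCPU: total=4, allocation_ratio=1.0  -> up to 4 x resources:PCPU=1

so the same four host cores can back up to 8 guest CPUs of allocations
during this intermediate state.
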
>
> If the cpu_allocation_ratio is 1.0 then yes, this is correct. However,
> if it's any greater (and remember, the default is 16.0) then the gap is
> much smaller, though still broken.
>
> > With the current master code, it's possible to create only 4
> > instances, whereas now, by reporting both VCPU and PCPU, it will
> > allow the user to create a total of 8 instances, which adds another
> > level of problem on top of the existing known issue. Is this
> > acceptable? Because this is compounding the problems.
>
> I think it is acceptable, yes. As we've said, this is broken behavior and
> things are just slightly more broken here, though not horribly so. As
> it stands, if you don't isolate pinned instances from non-pinned
> instances, you don't get any of the guarantees pinning is supposed to
> provide. Using the above example, if you booted two pinned and two
> unpinned instances on the same host, the unpinned instances would float
> over the pinned instances' cores [*] and impact their performance. If
> performance is an issue, host aggregates will have been used.
>
> [*] They'll actually float over the entire range of host cores since
> instances without a NUMA topology don't respect the 'vcpu_pin_set'
> value.
>
Yes, I agree with Stephen: with current master we don't suggest that
users mix pinned and non-pinned instances on the same host.
If users want to mix pinned and non-pinned instances, they need to
update their configuration to use cpu_dedicated_set and cpu_shared_set.
Having vcpu_pin_set report both VCPU and PCPU inventories is an
intermediate state. In that intermediate state, the operator still
needs to keep pinned and non-pinned instances on different hosts.
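For example (a minimal sketch, the CPU ranges here are made up), the
end-state configuration on a host that should run both kinds of
instances would look something like:

  [compute]
  # host CPUs dedicated to pinned (PCPU) instances
  cpu_dedicated_set = 0-3
  # host CPUs shared by floating (VCPU) instances
  cpu_shared_set = 4-7

with vcpu_pin_set left unset, so each host CPU backs exactly one kind
of inventory.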
>
> > If this is not acceptable, then we can report only PCPU in this
> > case, which will solve two problems:
> > 1. The existing known issue on current master (allowing both pinned
> >    and non-pinned instances on a compute host meant for pinning).
> > 2. The above issue of allowing 8 instances to be created on the host.
> > But there is one problem with taking this decision: if no instances
> > are running on the compute node and only ``vcpu_pin_set`` is set,
> > how do you find out whether this compute node is configured to
> > create pinned or non-pinned instances? If instances are running,
> > it's possible to detect that based on the host's
> > numa_topology.pinned_cpus.
>
> As noted previously, this is too complex and too error prone. Let's
> just suffer the potential additional impact on performance for those
> who haven't correctly configured their deployment, knowing that as
> soon as they get to U, where we can require the 'cpu_dedicated_set'
> and 'cpu_shared_set' options for anyone who wants to use pinned
> instances, things will be fixed.
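
Right, and once those options are in place, a pinned flavor can be
requested either way (the flavor name below is made up, just to
illustrate):

  # legacy extra spec, translated by nova into a PCPU request
  openstack flavor set pinned.small --property hw:cpu_policy=dedicated
  # or request PCPU inventory explicitly
  openstack flavor set pinned.small --property resources:PCPU=2

and either form will only land on hosts that report PCPU inventory.
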
>
> Stephen
>
>
>