[nova] Spec: Standardize CPU resource tracking

Shewale, Bhagyashri Bhagyashri.Shewale at nttdata.com
Tue Jun 18 06:41:26 UTC 2019


>> As above, ignore 'cpu_shared_set' but issue a warning. Use the value of

>> 'vcpu_pin_set' to report both VCPU and PCPU inventory. Note that

>> 'vcpu_pin_set' is already used to calculate VCPU inventory.


As mentioned in the spec, if the operator sets ``vcpu_pin_set`` in Stein and then upgrades to Train, both VCPU and PCPU inventory should be reported to placement.


On current master (Stein), if the operator sets ``vcpu_pin_set=0-3`` on compute node A and adds node A to a host aggregate, say "agg1", with metadata ``pinned=true``, then it is possible to create both pinned and non-pinned instances on that node, which is a well-known issue:

  1.  Create instance A with flavor extra spec ("aggregate_instance_extra_specs:pinned": "true"); instance A will float across CPUs 0-3.
  2.  Create instance B with flavor extra specs ("aggregate_instance_extra_specs:pinned": "true", "hw:cpu_policy": "dedicated"); instance B will be pinned to one of those CPUs, say CPU 0.
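
For reference, the scenario above in rough CLI form (illustrative only; host and flavor names are invented):

    # nova.conf on compute node A (Stein):
    #   [DEFAULT]
    #   vcpu_pin_set = 0-3

    # Aggregate "agg1" with pinned=true metadata, containing node A:
    openstack aggregate create agg1
    openstack aggregate set --property pinned=true agg1
    openstack aggregate add host agg1 computeA

    # Flavor for instance A (floats across CPUs 0-3):
    openstack flavor set flavor-float \
      --property aggregate_instance_extra_specs:pinned=true

    # Flavor for instance B (pinned to one of CPUs 0-3):
    openstack flavor set flavor-pinned \
      --property aggregate_instance_extra_specs:pinned=true \
      --property hw:cpu_policy=dedicated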

Now the operator performs the upgrade (Stein to Train) and nova-compute reports both VCPU and PCPU inventory. If cpu_allocation_ratio is 1, the total PCPU available will be 4 (vcpu_pin_set=0-3) and VCPU will also be 4. This allows the user to create up to 4 instances with flavor extra spec ``resources:PCPU=1`` and another 4 instances with flavor extra spec ``resources:VCPU=1``.
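
To make the arithmetic concrete, a rough sketch (the exact values are assumed, not taken from the spec):

    # nova.conf on the upgraded node (Train):
    #   [DEFAULT]
    #   vcpu_pin_set = 0-3
    #   cpu_allocation_ratio = 1.0
    #
    # Inventory reported to placement under the proposal above:
    #   VCPU: total=4, allocation_ratio=1.0  -> 4 consumable
    #   PCPU: total=4, allocation_ratio=1.0  -> 4 consumable
    #   => up to 8 guest CPUs can be claimed against the same 4 host cores.

    # Flavors that consume that inventory explicitly:
    openstack flavor set flavor-pcpu --property resources:PCPU=1
    openstack flavor set flavor-vcpu --property resources:VCPU=1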


With the current master code it is possible to create only 4 instances, whereas reporting both VCPU and PCPU would let the user create a total of 8 instances, compounding the existing known issue with another problem. Is this acceptable?


If not acceptable, then we can report only PCPU in this case, which would solve two problems:

  1.  The existing known issue on current master (allowing both pinned and non-pinned instances on a compute host meant for pinning).
  2.  The above issue of allowing 8 instances to be created on the host.

But there is one problem with taking this decision: if no instances are running on a compute node where only ``vcpu_pin_set`` is set, how do we find out whether that node is configured for pinned or non-pinned instances? If instances are running, it is possible to detect this from the host's numa_topology.pinned_cpus.
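
In other words, on an idle upgraded node the configuration alone is ambiguous (sketch):

    [DEFAULT]
    # the only CPU-related option set on the upgraded, idle node
    vcpu_pin_set = 0-3
    # -> intended for pinned guests (report PCPU) or floating guests
    #    (report VCPU)? Nothing in the config distinguishes the two, and
    #    numa_topology.pinned_cpus only helps once instances exist.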


Regards,

Bhagyashri Shewale

________________________________
From: Stephen Finucane <sfinucan at redhat.com>
Sent: Monday, June 17, 2019 7:10:28 PM
To: Shewale, Bhagyashri; openstack-discuss at lists.openstack.org
Subject: Re: [nova] Spec: Standardize CPU resource tracking

[Cleaning up the 'To' field since Jay isn't working on OpenStack
anymore and everyone else is on openstack-discuss already]

On Fri, 2019-06-14 at 08:35 +0000, Shewale, Bhagyashri wrote:
> > cpu_shared_set in Stein was used for the VM emulator thread and required
> > the instance to be pinned for it to take effect, i.e. the
> > hw:emulator_thread_policy extra spec currently only works if you
> > had hw:cpu_policy=dedicated, so we should not error if vcpu_pin_set
> > and cpu_shared_set are defined; it was valid. What we can do is
> > ignore the cpu_shared_set for scheduling and not report 0 VCPUs
> > for this host and use vcpu_pin_set as PCPUs.

> Thinking of backward compatibility, I agree both of these
> configuration options, ``cpu_shared_set`` and ``vcpu_pin_set``, should
> be allowed in the Train release as well.
>
> A few possible combinations in Train:
> A) What if only ``cpu_shared_set`` is set on a new compute node?
> Report VCPU inventory.

I think this is _very_ unlikely to happen in the real world since the
lack of a 'vcpu_pin_set' option means an instance's pinned CPUs could
co-exist on the same cores as the emulator threads, which defeats the
whole point of placing emulator threads on a separate core. That said,
it's possible so we do have to deal with it.

Ignore 'cpu_shared_set' in this case and issue a warning saying that
the user has to configure 'cpu_dedicated_set'.

> B) What if ``cpu_shared_set`` and ``cpu_dedicated_set`` are set on
> a new compute node? Report VCPU and PCPU inventory. In fact, we
> want to support both these options so that an instance can request both
> VCPU and PCPU at the same time. If the flavor requests VCPU or
> hw:emulator_thread_policy=share, in both cases it will float on the
> CPUs set in the ``cpu_shared_set`` config option.

We should report both VCPU and PCPU inventory, yes. However, please
don't add the ability to create a single instance with combined VCPU
and PCPU inventory. I dropped this from the spec intentionally to make
it easier for something (_anything_) to land. We can iterate on this
once we have the basics done.
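
For case B, a minimal Train-style config might look like this (CPU ranges are just examples):

    [compute]
    # emulator threads and VCPU (shared) guests float over these cores
    cpu_shared_set = 0-1
    # PCPU (dedicated) guests are pinned to these cores
    cpu_dedicated_set = 2-7
    # -> the host reports both VCPU (from cpu_shared_set) and PCPU
    #    (from cpu_dedicated_set) inventory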

> C) What if ``cpu_shared_set`` and ``vcpu_pin_set`` are set on a new
> compute node? Ignore cpu_shared_set and report vcpu_pin_set as
> VCPU or PCPU?

As above, ignore 'cpu_shared_set' but issue a warning. Use the value of
'vcpu_pin_set' to report both VCPU and PCPU inventory. Note that
'vcpu_pin_set' is already used to calculate VCPU inventory.

https://opendev.org/openstack/nova/src/branch/master/nova/virt/libvirt/driver.py#L5808-L5811

> D) What if ``cpu_shared_set`` and ``vcpu_pin_set`` are set on an
> upgraded compute node? As you have mentioned, ignore cpu_shared_set
> and report vcpu_pin_set as PCPUs provided the ``NumaTopology``
> ``pinned_cpus`` attribute is not empty, otherwise VCPU.

Ignore 'cpu_shared_set' but issue a warning. Use the value of
'vcpu_pin_set' to report both VCPU and PCPU inventory. Note that
'vcpu_pin_set' is already used to calculate VCPU inventory.
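
So for C and D the effective behaviour would be roughly (option values invented):

    [compute]
    cpu_shared_set = 0-1   # ignored for scheduling; a warning is logged

    [DEFAULT]
    vcpu_pin_set = 2-7     # reported as both VCPU:6 and PCPU:6 inventory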

> > We explicitly do not want to have the behavior in 3 and 4, specifically
> > the logic of checking the instances.
>
> Here we are checking the host ``NumaTopology`` ``pinned_cpus``
> attribute and not the instances directly (if that attribute is not
> empty, it means some instances are running), and this logic will be
> needed to address case D above.

You shouldn't need to do this. Rely solely on configuration options to
determine inventory, even if it means reporting more inventory than we
actually have (reporting a host core as both units of VCPU and PCPU)
and hope that operators have correctly used host aggregates to isolate
NUMA-based instances from non-NUMA-based instances.
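
For completeness, the usual aggregate-based isolation pattern looks roughly like this (host and aggregate names invented):

    # hosts dedicated to pinned (PCPU/NUMA) guests
    openstack aggregate create pinned-hosts
    openstack aggregate set --property pinned=true pinned-hosts
    openstack aggregate add host pinned-hosts computeA

    # hosts for floating (VCPU) guests
    openstack aggregate create shared-hosts
    openstack aggregate set --property pinned=false shared-hosts
    openstack aggregate add host shared-hosts computeB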

I realize this is very much in flux but could you please push what you
have up for review, marked as WIP or such. Debating this stuff in the
code might be easier.

Stephen
