[nova] Spec: Standardize CPU resource tracking

Alex Xu soulxu at gmail.com
Mon Jun 17 08:50:35 UTC 2019


Alex Xu <soulxu at gmail.com> wrote on Mon, Jun 17, 2019 at 4:45 PM:

> I'm thinking we should have a recommended upgrade flow. If we give the
> operator a lot of flexibility to combine the values of vcpu_pin_set,
> cpu_dedicated_set and cpu_shared_set, then we run into the trouble
> described in this email and have to do all the checks this email
> introduced as well.
>
> I'm thinking that the pre-request filter (which translates
> cpu_policy=dedicated into a PCPU request) should be enabled only after all
> the nodes are upgraded to the Train release. Before that, all
> cpu_policy=dedicated instances keep using VCPU.
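>
> As a rough sketch of what that translation does (my reading of the spec,
> not the actual filter code; the names here are illustrative):
>
>     from dataclasses import dataclass, field
>
>     @dataclass
>     class Flavor:  # stand-in for the real flavor object
>         vcpus: int
>         extra_specs: dict = field(default_factory=dict)
>
>     def translate_cpu_policy(flavor: Flavor) -> Flavor:
>         # Alias hw:cpu_policy=dedicated to an explicit PCPU request;
>         # per the spec discussion, hw:cpu_policy itself is not removed.
>         if flavor.extra_specs.get('hw:cpu_policy') == 'dedicated':
>             flavor.extra_specs['resources:PCPU'] = str(flavor.vcpus)
>         else:
>             flavor.extra_specs['resources:VCPU'] = str(flavor.vcpus)
>         return flavor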
>
> Trying to imagine the upgrade as below:
>
> 1. Rolling upgrade the compute nodes.
> 2. The upgraded compute node begins to report both VCPU and PCPU, without
> reshaping the existing inventories yet.
>      The upgraded node is still using the vcpu_pin_set config option, or
> doesn't set it at all. In both of these cases the node reports VCPU and
> PCPU at the same time, and a request with cpu_policy=dedicated still
> consumes VCPU. Then it works the same as the Stein release, and existing
> instances can be shelved/unshelved, migrated and evacuated.
> 3. Disable new requests and instance operations on the hosts for dedicated
> instances. (Does this kind of break our live upgrade? I thought this would
> be a short interruption of the control plane, if that is acceptable.)
> 4. Reshape the inventories of the existing instances on all the hosts.
> 5. Enable new instance requests and operations again, and also enable the
> pre-request filter.
> 6. The operator copies the value of vcpu_pin_set to cpu_dedicated_set. If
> vcpu_pin_set isn't set, the value of cpu_dedicated_set should be all the
> CPU IDs, excluding cpu_shared_set if that is set.
>

I should adjust the order of steps 4, 5 and 6 as below:

4. The operator copies the value of vcpu_pin_set to cpu_dedicated_set. If
vcpu_pin_set isn't set, the value of cpu_dedicated_set should be all the
CPU IDs, excluding cpu_shared_set if that is set (see the nova.conf sketch
below).
5. The change of cpu_dedicated_set triggers the reshape of the existing
inventories and removes the duplicated VCPU resource reporting.
6. Enable new instance requests and operations again, and also enable the
pre-request filter.
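
To make step 4 concrete, a minimal nova.conf sketch (assuming the option
names and sections from the spec; adjust the CPU IDs to your host):

    # Before (Stein-style config)
    [DEFAULT]
    vcpu_pin_set = 0-3

    # After (Train-style config; the same CPUs are now reported as PCPU)
    [compute]
    cpu_dedicated_set = 0-3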



>
> Two rules here:
> 1. The operator is not allowed to set cpu_dedicated_set to a value
> different from vcpu_pin_set while any instance is running on the host.
> 2. The operator is not allowed to change the value of cpu_dedicated_set or
> cpu_shared_set while any instance is running on the host.
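>
> A minimal sketch of the kind of guard these two rules imply (illustrative
> only, not an actual Nova check):
>
>     def validate_cpu_set_change(old_set, new_set, running_instances):
>         """Reject a dedicated/shared CPU set change on a busy host."""
>         if running_instances and old_set != new_set:
>             raise ValueError(
>                 'cannot change the dedicated/shared CPU sets while %d '
>                 'instance(s) are running' % len(running_instances))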
>
>
>
> Shewale, Bhagyashri <Bhagyashri.Shewale at nttdata.com> wrote on Fri,
> Jun 14, 2019 at 4:42 PM:
>
>> >> That is incorrect: both A and B will be returned. The spec states
>> >> that for host A we report an inventory of 4 VCPUs and an inventory of
>> >> 4 PCPUs, and host B will have an inventory of 4 PCPUs, so both hosts
>> >> will be returned assuming
>> >> $<no. of cpus> <= 4
>>
>>
>> This means that if ``vcpu_pin_set`` is set in the previous release, the
>> node reports both VCPU and PCPU as inventory in Train, but this seems
>> contradictory. For example:
>>
>>
>> On Stein,
>>
>>
>> Configuration on compute node A:
>>
>> vcpu_pin_set=0-3 (This reports an inventory of 4 VCPUs to the placement
>> database)
>>
>>
>> On Train:
>>
>> vcpu_pin_set=0-3
>>
>>
>> The inventory will be reported as 4 VCPUs and 4 PCPUs in the placement db
>>
>>
>> Now say the user wants to create instances as below:
>>
>>    1. Flavor having extra specs (resources:PCPU=1), instance A
>>    2. Flavor having extra specs (resources:VCPU=1), instance B
>>
>>
>> For both instance requests, placement will return compute Node A.
>>
>> Instance A:  will be pinned to, say, CPU 0
>>
>> Instance B:  will float on CPUs 0-3
>>
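>> For example, the two flavors above could be created like this (standard
>> openstack CLI; the flavor names are made up):
>>
>>     openstack flavor create --vcpus 1 --ram 512 --disk 1 pinned.small
>>     openstack flavor set pinned.small --property resources:PCPU=1
>>
>>     openstack flavor create --vcpus 1 --ram 512 --disk 1 floating.small
>>     openstack flavor set floating.small --property resources:VCPU=1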
>>
>> To resolve the above issue, I think it's possible to detect whether the
>> compute node was configured to be used for pinned instances by checking
>> whether the ``NUMATopology`` ``pinned_cpus`` attribute is non-empty. In
>> that case, vcpu_pin_set will be reported as PCPU, otherwise as VCPU.
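>>
>> A minimal sketch of that detection (assuming the host NUMA topology
>> object exposes cells, each with a pinned_cpus set; this is a sketch, not
>> a worked-out patch):
>>
>>     def host_has_pinned_instances(numa_topology):
>>         """Guess whether a host has been used for pinned instances."""
>>         if numa_topology is None:
>>             return False
>>         return any(cell.pinned_cpus for cell in numa_topology.cells)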
>>
>>
>> Regards,
>>
>> -Bhagyashri Shewale-
>>
>> ------------------------------
>> *From:* Sean Mooney <smooney at redhat.com>
>> *Sent:* Thursday, June 13, 2019 8:32:02 PM
>> *To:* Shewale, Bhagyashri; openstack-discuss at lists.openstack.org;
>> openstack at fried.cc; sfinucan at redhat.com; jaypipes at gmail.com
>> *Subject:* Re: [nova] Spec: Standardize CPU resource tracking
>>
>> On Wed, 2019-06-12 at 09:10 +0000, Shewale, Bhagyashri wrote:
>> > Hi All,
>> >
>> >
>> > Currently I am working on the implementation of the CPU pinning
>> > upgrade part as mentioned in the spec [1].
>> >
>> >
>> > While implementing the scheduler pre-filter as mentioned in [1], I have
>> encountered one big issue:
>> >
>> >
>> > Proposed change in spec: in the scheduler pre-filter we are going to
>> > alias request_spec.flavor.extra_specs and request_spec.image.properties
>> > from ``hw:cpu_policy`` to ``resources=(V|P)CPU:${flavor.vcpus}`` for
>> > existing instances.
>> >
>> >
>> > So when the user creates a new instance or executes instance actions
>> > like shelve, unshelve, resize, evacuate and migration post-upgrade, it
>> > will go through the scheduler pre-filter, which will set the alias for
>> > ``hw:cpu_policy`` in the request_spec flavor ``extra_specs`` and image
>> > metadata properties. In the particular case below, it won't work:
>> >
>> >
>> > For example:
>> >
>> >
>> > I have two compute nodes say A and B:
>> >
>> >
>> > On Stein:
>> >
>> >
>> > Compute node A configurations:
>> >
>> > vcpu_pin_set=0-3 (used for dedicated CPUs; this host is added to an
>> > aggregate which has "pinned" metadata)
>> vcpu_pin_set does not mean that the host was used for pinned instances:
>> https://that.guru/blog/cpu-resources/
>> >
>> >
>> > Compute node B Configuration:
>> >
>> > vcpu_pin_set=0-3 (used for dedicated CPUs; this host is added to an
>> > aggregate which has "pinned" metadata)
>> >
>> >
>> > On Train, two possible scenarios:
>> >
>> > Compute node A configuration: (assuming the new CPU pinning
>> > implementation is merged into Train)
>> >
>> > vcpu_pin_set=0-3  (Keep same settings as in Stein)
>> >
>> >
>> > Compute node B configuration: (assuming the new CPU pinning
>> > implementation is merged into Train)
>> >
>> > cpu_dedicated_set=0-3 (change to the new config option)
>> >
>> >   1.  Consider that one instance, say ``test``, is created using a
>> > flavor having the old extra specs (hw:cpu_policy=dedicated,
>> > "aggregate_instance_extra_specs:pinned": "true") in the Stein release,
>> > and Nova is now upgraded to Train with the above configuration.
>> >   2.  Now when the user performs an instance action, say
>> > shelve/unshelve, the scheduler pre-filter will change the request_spec
>> > flavor extra spec from ``hw:cpu_policy`` to
>> > ``resources=PCPU:$<no. of cpus>``
>> It won't remove hw:cpu_policy; it will just change
>> resources=VCPU:$<no. of cpus> -> resources=PCPU:$<no. of cpus>.
>>
>> >  which ultimately will return only compute node B from placement
>> service.
>> That is incorrect: both A and B will be returned. The spec states that
>> for host A we report an inventory of 4 VCPUs and an inventory of 4 PCPUs,
>> and host B will have an inventory of 4 PCPUs, so both hosts will be
>> returned assuming
>> $<no. of cpus> <= 4
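>>
>> In other words, after the upgrade the inventories would look roughly
>> like this (illustrative values):
>>
>>     host A (vcpu_pin_set=0-3):      VCPU: 4, PCPU: 4  (dual-reported)
>>     host B (cpu_dedicated_set=0-3): PCPU: 4
>>
>> so a request for resources:PCPU=1 can be satisfied by either host.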
>>
>> >  Here, we expect it should have returned both compute A and compute B.
>> It will.
>> >   3.  If the user creates a new instance using the old extra specs
>> > (hw:cpu_policy=dedicated, "aggregate_instance_extra_specs:pinned":
>> > "true") on the Train release with the above configuration, then it
>> > will return only compute node B from the placement service, whereas it
>> > should have returned both compute nodes A and B.
>> That is what would have happened in the Stein version of the spec, and
>> we changed the spec specifically to ensure that won't happen. In the
>> Train version of the spec you will get both hosts as candidates to
>> prevent this upgrade impact.
>> >
>> > Problem: As compute node A is still configured to boot instances with
>> > dedicated CPUs, the same behavior as Stein, it will not be returned by
>> > the placement service due to the changes in the scheduler pre-filter
>> > logic.
>> >
>> >
>> > Propose changes:
>> >
>> >
>> > Earlier in the spec [2], an online data migration was proposed to
>> > change the flavor extra specs and image metadata properties of the
>> > request_spec and instance objects. Based on the instance host, we can
>> > get the NUMATopology of the host, which will contain the new
>> > configuration options set on the compute host. Based on the
>> > NUMATopology of the host, we can change the instance and request_spec
>> > flavor extra specs:
>> >
>> >   1.  Remove cpu_policy from the extra specs
>> >   2.  Add "resources:PCPU=<count>" to the extra specs
>> >
>> >
>> > We can also change the flavor extra specs and image metadata
>> > properties of the instance and request_spec objects using the reshape
>> > functionality.
>> >
>> >
>> > Please give us your feedback on the proposed solution so that we can
>> > update the spec accordingly.
>> I am fairly strongly opposed to using an online data migration to modify
>> the request spec to reflect the host the instances landed on. This
>> specific problem is why the spec was changed in the Train cycle to report
>> dual inventories of VCPU and PCPU if vcpu_pin_set is the only option set,
>> or if no options are set.
>> >
>> >
>> > [1]:
>> https://review.opendev.org/#/c/555081/28/specs/train/approved/cpu-resources.rst@451
>> >
>> > [2]:
>> https://review.opendev.org/#/c/555081/23..28/specs/train/approved/cpu-resources.rst
>> >
>> >
>> > Thanks and Regards,
>> >
>> > -Bhagyashri Shewale-
>> >
>>
>>
>