Eric Fried <openstack@fried.cc> wrote on Wednesday, April 17, 2019 at 3:28 AM:
I'm not sure I understand your proposal. Would you introduce a VM resource and then allocate 1 of that resource for each VM?
This has been proposed before, somewhere: translating max_instances_per_host to an inventory of resource class "VM" on the compute node RP, and including resources:VM=1 in every GET /a_c request.
Actually, I propose attaching the traits to the RP that has the VCPU resource. In the case where NUMA is modeled in placement, we would attach the traits to the NUMA node RPs. Let me try to explain why that may make sense: those traits belong to the "Compute" resource, and VCPU is just that "Compute" resource. (Yes, if we have two NUMA nodes, then both NUMA nodes have those traits, but that should be fine.) When we ask for those traits, we must also be asking for VCPU, right? If so, it sounds like it makes sense. Or is there any case where we request only a trait, with no resource request at all? If yes, this may not work. But I think the case we started discussing is the one where the trait and the resource aren't on the same RP. For the neutron bandwidth case, we still attach the NIC type trait to the PF, not the agent, for the same reason.
This would solve the class of use cases like:
So we have a specific use case: COMPUTE_TRUSTED_CERTS + NUMA
but wouldn't help us for:
So what we need to solve is this: two (or more) sets of resources, where the different sets require different, contradicting traits, in a setup where the trait is not on the RP that holds the resource inventory.
compute RP
|
|____ OVS agent RP
|     | * CUSTOM_VNIC_TYPE_NORMAL
|     |
|     |___________ br-int dev RP
|                  * CUSTOM_PHYSNET_PHYSNET0
|                  * NET_BW_EGR_KILOBIT_PER_SEC: 1000
|
|____ SRIOV agent RP
|     | * CUSTOM_VNIC_TYPE_DIRECT
|     |
|     |___________ esn1 dev RP
|     |            * CUSTOM_PHYSNET_PHYSNET0
|     |            * NET_BW_EGR_KILOBIT_PER_SEC: 10000
|     |
|     |___________ esn2 dev RP
|                  * CUSTOM_PHYSNET_PHYSNET1
|                  * NET_BW_EGR_KILOBIT_PER_SEC: 20000
Then having two neutron ports in a server create request:

* port-normal:
  "resource_request": {
      "resources": {orc.NET_BW_EGR_KILOBIT_PER_SEC: 1000},
      "required": ["CUSTOM_PHYSNET0", "CUSTOM_VNIC_TYPE_NORMAL"]
  }

* port-direct:
  "resource_request": {
      "resources": {orc.NET_BW_EGR_KILOBIT_PER_SEC: 2000},
      "required": ["CUSTOM_PHYSNET0", "CUSTOM_VNIC_TYPE_DIRECT"]
  }
...unless we contrive some inventory unit to put on the agent RPs. What would that be? VNIC? How would we know how many to create?
Interestingly, the above is closely approaching the space we're exploring for "subtree affinity". I'm wondering if there's a Unified Solution...
Yeah, if we are going to create a virtual resource 'VM', then we need 'VNIC' too, and then we will need more. I don't like that.
For example, if we said that "traits always flow down [4]" (the phrase that entered my brain and got me to start this email, "down" in this case is "in the direction of children") then some traits could be on the compute node, but expressed in a numbered request group if that happened to be more convenient.
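As a rough illustration of "traits flow down", the sketch below rebuilds the OVS/SRIOV provider tree from earlier in this thread and matches the two port requests against it, once with only each provider's own traits and once with ancestor traits included. The Provider class and the candidates() helper are hypothetical names invented for this example, not placement code. It also assumes the tree's spelling of the physnet trait (CUSTOM_PHYSNET_PHYSNET0) and ignores allocations and reserved amounts.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Provider:
    """Hypothetical stand-in for a resource provider in a nested tree."""
    name: str
    parent: Optional["Provider"] = None
    traits: Set[str] = field(default_factory=set)
    inventory: Dict[str, int] = field(default_factory=dict)

    def effective_traits(self) -> Set[str]:
        # Own traits plus everything inherited from ancestors.
        inherited = self.parent.effective_traits() if self.parent else set()
        return self.traits | inherited


def candidates(providers: List[Provider], resources: Dict[str, int],
               required: List[str], flow_down: bool) -> List[str]:
    """Names of providers that can satisfy one request group on their own."""
    matches = []
    for rp in providers:
        traits = rp.effective_traits() if flow_down else rp.traits
        enough = all(rp.inventory.get(rc, 0) >= amount
                     for rc, amount in resources.items())
        if enough and set(required) <= traits:
            matches.append(rp.name)
    return matches


BW = "NET_BW_EGR_KILOBIT_PER_SEC"
compute = Provider("compute")
ovs = Provider("ovs_agent", compute, {"CUSTOM_VNIC_TYPE_NORMAL"})
sriov = Provider("sriov_agent", compute, {"CUSTOM_VNIC_TYPE_DIRECT"})
br_int = Provider("br-int", ovs, {"CUSTOM_PHYSNET_PHYSNET0"}, {BW: 1000})
esn1 = Provider("esn1", sriov, {"CUSTOM_PHYSNET_PHYSNET0"}, {BW: 10000})
esn2 = Provider("esn2", sriov, {"CUSTOM_PHYSNET_PHYSNET1"}, {BW: 20000})
tree = [compute, ovs, sriov, br_int, esn1, esn2]

port_normal = ({BW: 1000}, ["CUSTOM_PHYSNET_PHYSNET0", "CUSTOM_VNIC_TYPE_NORMAL"])
port_direct = ({BW: 2000}, ["CUSTOM_PHYSNET_PHYSNET0", "CUSTOM_VNIC_TYPE_DIRECT"])

# Without inheritance no single provider has both the inventory and the traits.
normal_plain = candidates(tree, *port_normal, flow_down=False)
direct_plain = candidates(tree, *port_direct, flow_down=False)
# With inheritance the device RPs pick up the agent's VNIC type trait.
normal_flow = candidates(tree, *port_normal, flow_down=True)
direct_flow = candidates(tree, *port_direct, flow_down=True)
```

In this toy model each port lands on exactly one device RP once the agent traits are inherited, and on none when they are not.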
This mental model works well for me, because nested often represents a _containing_ hierarchy [2].
If the "compute RP has no resources to give [...] but it's still the thing exposing traits we want to filter by" [3], then making the children inherit those traits (because the traits have flowed down and the children are "inside" the thing) makes things feel a bit saner to me. It would be good if Eric were able to express in more detail why inheriting feels "terrible" [3]. It could very well be.
I also said "feels". I can't really explain it any better than I could explain why "using group numbers as values" gave me the ooks. And given we're coming up ugly with all other proposals, convince me that this one is practical and not fraught with peril and I'll quickly get over my discomfort. Right now I'm pretty close to that point because it elegantly solves both classes of problem described above, and I can't think of a way to break it that isn't ridiculously contrived.
It's possible we punted on it before because a) we didn't have the concrete use cases we have now; and b) it was going to be pretty tricky to implement. More on that below.
Similarly, aggregate membership would flow down as well: a child is always in its parent's aggregates too, because it is inside its parent.
This one I'm not so convinced about. Can we defer making changes here until we have similarly concrete use cases?
A numeric requiredN or member_ofN span would be capped by the resource provider that satisfied resourcesN.
Eh? I was following you up to this point. Do you just mean that we don't have to worry about ascending the tree looking for requiredN because the trait is implicitly on the provider with resourcesN by virtue of being on its ancestor?
We need to work out a consistent and relatively easy to explain mental model for this, because we need to be able to talk about it with other people without them being required to re-experience all the mental hurdles we are having to overcome.
I think the hurdles are more around "why" and "are you sure you want to" - once we've made those decisions, IMO it can be understood fairly easily with one or both of "encapsulation" and "traits flow down" as you've explained them.
[4] A corollary could be "classes of inventory always flow up": If you need a SRIOV_NET_VF, this root resource provider can provide it because it has a great grandchild which has it.
This one bakes my noodle pretty good. I have a harder time visualizing how the above use cases are satisfied by walking backwards up the tree accumulating resources (and you have to accumulate the traits as well, right?) until I hit a point where I've gathered everything I need.
So I'll come down in favor of making "traits flow down" happen. Question is, how? (And I know we've talked about this before - maybe Queens+Denver?)
(A) In the database.
  (i) Any time a trait is added to a provider, we create records for the same trait for all descendants.
  (ii) Need a data migration to bring existing data into conformance with ^
  (iii) When a trait is deleted from a provider, I assume we need to recursively delete it from all descendants. If you didn't want that, you'd have to go back and re-add it to the descendants you wanted it on.

Pros: Easy to do. We don't have to change any of the APIs' algorithms - they just work the way we want them to by virtue of the trait data being where we want it. Reporting (e.g. GET /rps and therefore CLI output) reflects "reality".
Cons: Irreversible. Not backward compatible. Can't do it in a microversion.
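To make the "irreversible" con of (A) concrete, here is a minimal in-memory sketch, assuming traits are copied to every descendant on add and removed from every descendant on delete. The children and traits dictionaries and the add_trait/delete_trait helpers are hypothetical and stand in for the real tables and API handlers.

```python
from collections import defaultdict

# Hypothetical slice of the tree: parent name -> list of child names.
children = {
    "compute": ["sriov_agent"],
    "sriov_agent": ["esn1", "esn2"],
    "esn1": [],
    "esn2": [],
}
traits = defaultdict(set)


def descendants(rp):
    """Yield every provider below rp, depth first."""
    for child in children[rp]:
        yield child
        yield from descendants(child)


def add_trait(rp, trait):
    # (A)(i): write the trait on the provider and on all of its descendants.
    traits[rp].add(trait)
    for d in descendants(rp):
        traits[d].add(trait)


def delete_trait(rp, trait):
    # (A)(iii): recursively remove the trait from all descendants too.
    traits[rp].discard(trait)
    for d in descendants(rp):
        traits[d].discard(trait)


DIRECT = "CUSTOM_VNIC_TYPE_DIRECT"
add_trait("esn1", DIRECT)          # set explicitly on the device
add_trait("sriov_agent", DIRECT)   # copied down to esn1 and esn2
delete_trait("sriov_agent", DIRECT)
# esn1's explicit trait is gone as well: a copied record cannot be
# distinguished from one that was set directly.
```

After the delete, esn1 no longer carries the trait it was given directly, which is the "go back and re-add it" situation described in (iii).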
(B) In the algorithms.
  (i) GET /rps and GET /a_cs queries need JOINs I can't even begin to comprehend.
  (ii) Do we tweak the outputs (GET /rps response and GET /a_cs provider_summaries) to report the "inherited" traits as well?

Pros: Can do it in a microversion.
Cons: See "can't even begin to comprehend". Maybe I'm a dunce.
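One possible shape for the query-time approach in (B) is a recursive common table expression that pairs each provider with itself and all of its ancestors, then joins traits through that lineage. The sketch below runs against SQLite with two simplified, hypothetical tables (providers, provider_traits); the real placement schema and the SQL dialects it has to support differ, so treat this only as an illustration of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE providers (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
CREATE TABLE provider_traits (provider_id INTEGER, trait TEXT);
INSERT INTO providers VALUES
    (1, 'compute', NULL), (2, 'sriov_agent', 1), (3, 'esn1', 2), (4, 'esn2', 2);
INSERT INTO provider_traits VALUES
    (2, 'CUSTOM_VNIC_TYPE_DIRECT'),
    (3, 'CUSTOM_PHYSNET_PHYSNET0'),
    (4, 'CUSTOM_PHYSNET_PHYSNET1');
""")

# lineage pairs every provider with itself and each of its ancestors,
# so joining traits on ancestor_id yields own plus inherited traits.
EFFECTIVE_TRAITS = """
WITH RECURSIVE lineage(provider_id, ancestor_id) AS (
    SELECT id, id FROM providers
    UNION ALL
    SELECT l.provider_id, p.parent_id
    FROM lineage AS l
    JOIN providers AS p ON p.id = l.ancestor_id
    WHERE p.parent_id IS NOT NULL
)
SELECT DISTINCT p.name, t.trait
FROM lineage AS l
JOIN provider_traits AS t ON t.provider_id = l.ancestor_id
JOIN providers AS p ON p.id = l.provider_id
ORDER BY p.name, t.trait
"""

rows = conn.execute(EFFECTIVE_TRAITS).fetchall()
```

Here rows lists esn1 and esn2 with both their own physnet trait and the agent's VNIC type trait, while the stored trait records stay untouched.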
Perhaps this suggests a hybrid approach:
(C) Create a "ghost" table of inherited resource provider traits. If $old_microversion we ignore it; if $new_microversion we logically combine it with the existing rp traits table in all our queries.
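A minimal sketch of (C), assuming a separate table holds only the inherited trait rows and the query consults it only at or above some new microversion. The table names, the traits_for() helper and the (1, 99) version number are all hypothetical placeholders; how the ghost table gets populated is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE provider_traits (provider TEXT, trait TEXT);
CREATE TABLE inherited_provider_traits (provider TEXT, trait TEXT);
INSERT INTO provider_traits VALUES
    ('sriov_agent', 'CUSTOM_VNIC_TYPE_DIRECT'),
    ('esn1', 'CUSTOM_PHYSNET_PHYSNET0');
-- "Ghost" row: esn1 inherits the agent's trait.
INSERT INTO inherited_provider_traits VALUES
    ('esn1', 'CUSTOM_VNIC_TYPE_DIRECT');
""")

NEW_MICROVERSION = (1, 99)  # hypothetical, not a real placement microversion


def traits_for(provider, microversion):
    """Own traits, plus inherited ones when the caller opted in."""
    sql = "SELECT trait FROM provider_traits WHERE provider = ?"
    params = [provider]
    if microversion >= NEW_MICROVERSION:
        # Logically combine the ghost table with the existing one.
        sql += (" UNION SELECT trait FROM inherited_provider_traits"
                " WHERE provider = ?")
        params.append(provider)
    return sorted(row[0] for row in conn.execute(sql, params))


old_view = traits_for("esn1", (1, 0))
new_view = traits_for("esn1", NEW_MICROVERSION)
```

Callers on the old microversion see exactly what they see today, and deleting the inherited rows restores the original data, which sidesteps the "irreversible" con of (A).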
Thoughts?
efried .