Sorry for the late response. Here are my thoughts on "resource provider affinity".

"The rps are in the same subtree" is equivalent to "there exists an rp which is an ancestor of all the other rps". Therefore,

* group_resources=1:2

means "rp2 is a descendant of rp1 (or rp1 is a descendant of rp2)".

We can extend this to cases with more than two groups:

* group_resources=1:2:3

means "both rp2 and rp3 are descendants of rp1 (or both rp1 and rp3 are descendants of rp2, or both rp1 and rp2 are descendants of rp3)".

Eric's question from the PTG yesterday was whether to keep the symmetry between rps, that is, whether to accept the conditions enclosed in the parentheses above.

I would say yes, keep the symmetry, because

1. the expression 1:2:3 reads as symmetric. If we want to make it asymmetric, it should express the subtree root more explicitly, like 1-2:3 or 1-2:3:4.
2. callers may not be aware of which resource (VCPU or VF) is provided by the upper/lower rp. IOW, the caller - the resource retriever (scheduler) - doesn't want to know how the reporter - the virt driver - has reported the resources.

Note that even in the symmetric world the negative expression Jay suggested looks good to me. It enables something like:

* group_resources=1:2:!3:!4

which means 1 and 2 should be in the same group, but 3 shouldn't be a descendant of 1 or 2, and neither should 4.

However, speaking at the design level, the adjacency list model (the so-called naive tree model), which we currently use for nested rps, is not good at retrieving subtrees (compared to e.g. the nested set model [1]).

[1] https://en.wikipedia.org/wiki/Nested_set_model

I have looked into the recursive SQL CTE (common table expression) feature, which helps us handle subtrees easily in the adjacency list model, in an experimental patch [2], but unfortunately it looks like the feature is still experimental in MySQL, and we don't want to run a query like this for every candidate, do we?
:(

[2] https://review.opendev.org/#/c/636092/

Therefore, for this specific use case of NUMA affinity, I'd like to propose, as an alternative, introducing a concept of resource group distance in the rp graph:

* NUMA affinity case - group_distance(1:2)=1
* anti NUMA affinity - group_distance(1:2)>1

which can be realized by looking at the cached adjacent rp (i.e. the parent id). (Supporting group_distance=N (N>1) would be future research, or we could implement it anyway and overlook the performance.)

One drawback of this is that we can't use it if you create multiple nested layers more than one level deep under NUMA rps, but is that the case for OvS bandwidth?

Another alternative is having a "closure table" from which we can retrieve all the descendant rp ids of an rp without joining tables. But... online migration cost?

- tetsuro
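
To make the two checks discussed above concrete, here is a rough Python sketch. It is only an illustration under assumptions: the PARENT map, the rp names, and the helper functions are all hypothetical and are not placement code. It treats the provider tree as an adjacency list (child rp -> parent rp), implements the symmetric "same subtree" test by walking parent ids, and reads group_distance=1 as "one rp is the direct parent of the other".

```python
# Hypothetical adjacency-list view of a provider tree: child rp -> parent rp.
# The names and the tree shape are made up for illustration.
PARENT = {
    "compute": None,
    "numa0": "compute",
    "numa1": "compute",
    "pf0": "numa0",
    "pf1": "numa1",
}


def ancestors(parent, rp):
    """Return the set of all ancestors of rp by walking the parent ids."""
    found = set()
    node = parent[rp]
    while node is not None:
        found.add(node)
        node = parent[node]
    return found


def in_same_subtree(parent, rps):
    """Symmetric check: some rp in rps is an ancestor of all the others."""
    rps = set(rps)
    return any(
        all(root in ancestors(parent, other) for other in rps - {root})
        for root in rps
    )


def is_adjacent(parent, rp_a, rp_b):
    """Distance-1 check (assumed meaning): one rp is the other's parent."""
    return parent[rp_a] == rp_b or parent[rp_b] == rp_a
```

The subtree check has to walk up the tree once per rp, which is the part the adjacency list model is bad at. The distance-1 check only needs a single parent id lookup per rp.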