<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Sorry for the late response, <br>
<br>
Here are my thoughts on "resource provider affinity".<br>
<br>
“The rps are in the same subtree” is equivalent to “there exists an rp
which is an ancestor of all the other rps”<br>
<br>
Therefore,<br>
* group_resources=1:2<br>
means “rp2 is a descendant of rp1 (or rp1 is a descendant of rp2)”.<br>
<br>
We can extend it to cases where we have more than two groups:<br
style="box-sizing: border-box;">
* group_resources=1:2:3<br>
means "both rp2 and rp3 are descendants of rp1 (or both rp1 and rp3
are descendants of rp2, or both rp1 and rp2 are descendants of rp3)".<br>
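<br>
As a minimal sketch of this symmetric check (not placement code): assume the<br>
tree is available as a plain dict mapping each rp to its parent. The rp names<br>
and the helper names ancestors() and in_same_subtree() are made up for<br>
illustration.<br>
<pre>
```python
def ancestors(rp, parent):
    """Return the set of strict ancestors of rp, walking the parent map."""
    seen = set()
    node = parent.get(rp)
    while node is not None:
        seen.add(node)
        node = parent.get(node)
    return seen


def in_same_subtree(rps, parent):
    """True if some rp in rps is an ancestor of all the other rps."""
    rps = set(rps)
    for root in rps:
        others = rps - {root}
        if all(root in ancestors(rp, parent) for rp in others):
            return True
    return False


# Hypothetical tree: compute node, two NUMA rps, one PF rp under numa0.
parent = {"cn": None, "numa0": "cn", "numa1": "cn", "pf0": "numa0"}

print(in_same_subtree(["numa0", "pf0"], parent))        # True
print(in_same_subtree(["pf0", "numa0"], parent))        # True (order does not matter)
print(in_same_subtree(["numa1", "pf0"], parent))        # False
print(in_same_subtree(["cn", "numa1", "pf0"], parent))  # True
```
</pre>
Because every rp in the list is tried as the candidate root, the caller does<br>
not need to know which rp sits higher in the tree.<br>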
<br>
Eric's question at the PTG yesterday was whether to keep the symmetry
between rps,<br>
that is, whether to accept the conditions enclosed in the parentheses
above.<br>
<br>
I would say yes, keep the symmetry, because<br>
<br>
1. the expression 1:2:3 reads as symmetric. If we want to make it
asymmetric, it should express the subtree root more explicitly, like
1-2:3 or 1-2:3:4.<br>
2. callers may not be aware of which resource (VCPU or VF) is
provided by the upper/lower rp.<br>
IOW, the caller - the resource retriever (scheduler) - doesn't want
to know how the reporter - the virt driver - has reported the resources.<br>
<br>
Note that even in the symmetric world, the negative expression Jay
suggested looks good to me.<br>
It enables something like:<br>
* group_resources=1:2:!3:!4<br style="box-sizing: border-box;">
which means 1 and 2 should be in the same group but 3 shouldn't be
a descendant of 1 or 2, and neither should 4.<br>
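<br>
A rough sketch of how such an expression could be evaluated, assuming the rp<br>
that satisfies each request group is already known and the tree is a plain<br>
parent map. The function names, the group_to_rp mapping and the rp names are<br>
hypothetical, and the "!" syntax is only the proposal discussed here.<br>
<pre>
```python
def parse_group_resources(value):
    """Split '1:2:!3:!4' into (['1', '2'], ['3', '4'])."""
    positive, negative = [], []
    for token in value.split(":"):
        if token.startswith("!"):
            negative.append(token[1:])
        else:
            positive.append(token)
    return positive, negative


def is_descendant(rp, ancestor, parent):
    """True if ancestor is a strict ancestor of rp in the parent map."""
    node = parent.get(rp)
    while node is not None:
        if node == ancestor:
            return True
        node = parent.get(node)
    return False


def satisfies(value, group_to_rp, parent):
    positive, negative = parse_group_resources(value)
    pos_rps = [group_to_rp[g] for g in positive]
    neg_rps = [group_to_rp[g] for g in negative]
    # Symmetric part: some positive rp is an ancestor of (or equal to)
    # every other positive rp.
    same_subtree = any(
        all(r == root or is_descendant(r, root, parent) for r in pos_rps)
        for root in pos_rps
    )
    # Negative part: no negated rp is a descendant of any positive rp.
    excluded = not any(
        is_descendant(n, p, parent) for n in neg_rps for p in pos_rps
    )
    return same_subtree and excluded


# Hypothetical tree: two NUMA rps under a compute node, PFs below them.
parent = {"cn": None, "numa0": "cn", "numa1": "cn",
          "pf0": "numa0", "pf1": "numa1", "pf2": "numa0"}

ok = {"1": "numa0", "2": "pf0", "3": "pf1", "4": "numa1"}
bad = {"1": "numa0", "2": "pf0", "3": "pf2", "4": "numa1"}
print(satisfies("1:2:!3:!4", ok, parent))   # True
print(satisfies("1:2:!3:!4", bad, parent))  # False: pf2 is under numa0
```
</pre>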
<br>
However, speaking at the design level, the adjacency list model (the
so-called naive tree model), which we currently use for nested rps,<br>
is not good at retrieving subtrees (compared to e.g. the nested set
model[1]).<br>
[1] <a href="https://en.wikipedia.org/wiki/Nested_set_model">https://en.wikipedia.org/wiki/Nested_set_model</a><br>
<br>
I have looked into the recursive SQL CTE (common table expression)
feature, which helps us handle subtrees easily in the adjacency list model,
in an experimental patch [2],<br>
but unfortunately it looks like the feature is still experimental in
MySQL, and we don't want to run a query like this for every candidate, do
we? :(<br>
<br>
[2] <a href="https://review.opendev.org/#/c/636092/">https://review.opendev.org/#/c/636092/</a><br>
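<br>
For illustration only, here is what a recursive CTE over an adjacency list<br>
looks like, run against an in-memory SQLite database so that it is<br>
self-contained. The table and column names are a simplified, assumed schema,<br>
and this is not the query from the patch [2].<br>
<pre>
```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: one row per rp with a pointer to its parent.
conn.execute(
    "CREATE TABLE resource_providers ("
    "id INTEGER PRIMARY KEY, name TEXT, parent_provider_id INTEGER)"
)
conn.executemany(
    "INSERT INTO resource_providers VALUES (?, ?, ?)",
    [(1, "cn", None), (2, "numa0", 1), (3, "numa1", 1),
     (4, "pf0", 2), (5, "pf1", 3)],
)

# Start from the given root and repeatedly join children onto the result.
SUBTREE_SQL = """
WITH RECURSIVE subtree(id) AS (
    SELECT id FROM resource_providers WHERE id = ?
    UNION ALL
    SELECT rp.id FROM resource_providers AS rp
    JOIN subtree ON rp.parent_provider_id = subtree.id
)
SELECT id FROM subtree ORDER BY id
"""


def subtree_ids(conn, root_id):
    """Return the ids of root_id and all of its descendants."""
    return [row[0] for row in conn.execute(SUBTREE_SQL, (root_id,))]


print(subtree_ids(conn, 2))  # [2, 4]
print(subtree_ids(conn, 1))  # [1, 2, 3, 4, 5]
```
</pre>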
<br>
Therefore, for this specific use case of NUMA affinity, I'd like to
propose an alternative: introducing a concept of resource group distance
in the rp graph.<br>
<br>
* numa affinity case<br style="box-sizing: border-box;">
- group_distance(1:2)=1<br>
* anti numa affinity<br style="box-sizing: border-box;">
- group_distance(1:2)>1<br>
<br>
which can be realized by looking into the cached adjacent rp (i.e.
parent id)<br style="box-sizing: border-box;">
(supporting group_distance=N (N>1) would be future research, or we
could implement it anyway and accept the performance cost)<br>
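<br>
A minimal sketch of the distance-1 check, assuming the cached parent ids are<br>
available as a dict and that "distance" means the number of edges between two<br>
rps. The function names and rp names are hypothetical, and treating the same<br>
rp (distance 0) as neither case is my assumption.<br>
<pre>
```python
def distance_is_one(rp_a, rp_b, parent):
    """True if one rp is the direct parent of the other."""
    return parent.get(rp_a) == rp_b or parent.get(rp_b) == rp_a


def distance_more_than_one(rp_a, rp_b, parent):
    """True if the rps differ and are not directly adjacent."""
    return rp_a != rp_b and not distance_is_one(rp_a, rp_b, parent)


# Hypothetical cached parent ids.
parent = {"cn": None, "numa0": "cn", "numa1": "cn", "pf0": "numa0"}

print(distance_is_one("numa0", "pf0", parent))         # True  (affinity)
print(distance_is_one("numa1", "pf0", parent))         # False
print(distance_more_than_one("numa1", "pf0", parent))  # True  (anti-affinity)
```
</pre>
Only the two parent ids are consulted, so no tree traversal or extra query is<br>
needed per candidate.<br>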
<br>
One drawback of this is that we can't use it if you create
nested layers more than one level deep under NUMA rps,<br>
but is that the case for OvS bandwidth?<br>
<br>
Another alternative is having a "closure table" from which we can
retrieve all the descendant rp ids of an rp without joining tables,<br>
but... online migration cost?<br>
<br>
- tetsuro<br>
<br>
</body>
</html>