<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    Sorry for the late response, <br>

    <br>

    Here are my thoughts on "resource provider affinity".<br>

    <br>

    “The rps are in the same subtree” is equivalent to “there exists an rp

    which is an ancestor of all the other rps”.<br>

    <br>

    Therefore,<br>

    * group_resources=1:2<br>

    means “rp2 is a descendant of rp1 (or rp1 is a descendant of rp2)”.<br>

    <br>

    We can extend it to cases where we have more than two groups:<br>

    * group_resources=1:2:3<br>

    means “both rp2 and rp3 are descendants of rp1 (or both rp1 and rp3

    are descendants of rp2, or both rp1 and rp2 are descendants of rp3)”.<br>
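
    <br>

    To make the symmetric reading concrete, here is a rough sketch of the
    check over an adjacency list. The rp names, the PARENTS map and the
    function names are made up for illustration, and treating an rp as its
    own ancestor (so two groups may land on the same rp) is my assumption:<br>

<pre>
```python
# Hypothetical adjacency list: rp name maps to parent rp name (None for a root).
PARENTS = {
    "compute": None,
    "numa0": "compute",
    "numa1": "compute",
    "pf0": "numa0",
}


def is_ancestor_or_self(parents, ancestor, rp):
    """Walk parent links upward from rp, looking for ancestor."""
    node = rp
    while node is not None:
        if node == ancestor:
            return True
        node = parents[node]
    return False


def in_same_subtree(parents, rps):
    """True if some rp in rps is an ancestor of (or equal to) all the others.

    Every rp is tried as the candidate root, which is what makes the
    check symmetric: the caller never says which rp is the upper one.
    """
    rps = list(rps)
    return any(
        all(is_ancestor_or_self(parents, root, rp) for rp in rps)
        for root in rps
    )
```
</pre>

    With this data, in_same_subtree(PARENTS, ["pf0", "numa0"]) and
    in_same_subtree(PARENTS, ["numa0", "pf0"]) give the same answer, while
    ["numa0", "numa1"] fails because neither is an ancestor of the other.<br>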

    <br>

    Eric's question from PTG yesterday was whether to keep the symmetry

    between rps,<br>

    that is, whether to also accept the alternatives enclosed in the

    parentheses above.<br>

    <br>

    I would say yes, keep the symmetry, because:<br>

    <br>

    1. the expression 1:2:3 reads as symmetric. If we want to make it

    asymmetric, it should express the subtree root more explicitly, like

    1-2:3 or 1-2:3:4.<br>

    2. callers may not be aware of which resource (VCPU or VF) is

    provided by the upper/lower rp.<br>

        IOW, the caller (the resource retriever, i.e. the scheduler) doesn't

    want to know how the reporter (the virt driver) has reported the resources.<br>

    <br>

    Note that even in the symmetric world, the negative expression Jay

    suggested looks good to me.<br>

    It enables something like:<br>

    * group_resources=1:2:!3:!4<br>

    which means 1 and 2 should be in the same subtree, but 3 shouldn't be

    a descendant of 1 or 2, and neither should 4.<br>
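
    <br>

    A minimal sketch of how such an expression could be parsed and
    evaluated. Nothing here is existing placement code: the function names,
    the PARENTS map and the GROUP_TO_RP mapping (request group number to the
    rp that satisfies it) are all hypothetical:<br>

<pre>
```python
# Hypothetical adjacency list (rp name to parent name) and a hypothetical
# mapping from request group number to the rp chosen for that group.
PARENTS = {"compute": None, "numa0": "compute", "numa1": "compute",
           "pf0": "numa0", "pf1": "numa1"}
GROUP_TO_RP = {"1": "numa0", "2": "pf0", "3": "pf1", "4": "numa1"}


def parse_group_resources(value):
    """Split '1:2:!3:!4' into (['1', '2'], ['3', '4'])."""
    positive, negative = [], []
    for token in value.split(":"):
        if token.startswith("!"):
            negative.append(token[1:])
        else:
            positive.append(token)
    return positive, negative


def is_descendant(parents, rp, ancestor):
    """True if ancestor is found strictly above rp in the tree."""
    node = parents[rp]
    while node is not None:
        if node == ancestor:
            return True
        node = parents[node]
    return False


def satisfies(parents, group_to_rp, value):
    """Positive groups share a subtree; negated ones sit under none of them."""
    positive, negative = parse_group_resources(value)
    pos_rps = [group_to_rp[g] for g in positive]
    neg_rps = [group_to_rp[g] for g in negative]
    same_subtree = any(
        all(rp == root or is_descendant(parents, rp, root) for rp in pos_rps)
        for root in pos_rps
    )
    excluded = all(
        not is_descendant(parents, neg, pos)
        for neg in neg_rps for pos in pos_rps
    )
    return same_subtree and excluded
```
</pre>

    With the sample data, satisfies(PARENTS, GROUP_TO_RP, "1:2:!3:!4") holds
    because pf0 is under numa0 while pf1 and numa1 are under neither of them.<br>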

    <br>

    However, speaking at the design level, the adjacency list model (the

    so-called naive tree model), which we currently use for nested rps,<br>

    is not good at retrieving subtrees (compared to e.g. the nested set

    model [1]).<br>

    [1] <a href="https://en.wikipedia.org/wiki/Nested_set_model">https://en.wikipedia.org/wiki/Nested_set_model</a><br>

    <br>

    I have looked into the recursive SQL CTE (common table expression)

    feature, which helps us handle subtrees easily in the adjacency list

    model, in an experimental patch [2],<br>

    but unfortunately it looks like the feature is still experimental in

    MySQL, and we don't want to run a query like this for every candidate,

    do we? :(<br>

    <br>

    [2] <a href="https://review.opendev.org/#/c/636092/">https://review.opendev.org/#/c/636092/</a><br>
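
    <br>

    For readers who haven't used recursive CTEs, this is roughly what such
    a subtree query looks like. It is not the query from the patch: it runs
    against an in-memory SQLite database as a stand-in for MySQL, and the
    table is a simplified, made-up version of the rp table:<br>

<pre>
```python
import sqlite3

# In-memory SQLite stands in for the real database; the schema below is a
# simplified assumption, not the actual placement schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE resource_providers (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        parent_provider_id INTEGER REFERENCES resource_providers(id)
    );
    INSERT INTO resource_providers VALUES
        (1, 'compute', NULL),
        (2, 'numa0', 1),
        (3, 'numa1', 1),
        (4, 'pf0', 2);
""")

# The anchor row is the subtree root; each recursive step adds the children
# of the rows found so far by following parent_provider_id.
SUBTREE_SQL = """
    WITH RECURSIVE subtree(id) AS (
        SELECT id FROM resource_providers WHERE id = ?
        UNION ALL
        SELECT rp.id
        FROM resource_providers AS rp
        JOIN subtree ON rp.parent_provider_id = subtree.id
    )
    SELECT id FROM subtree ORDER BY id
"""


def subtree_ids(connection, root_id):
    """Return the ids of root_id and all of its descendants."""
    return [row[0] for row in connection.execute(SUBTREE_SQL, (root_id,))]
```
</pre>

    The query itself is short, but it has to walk the tree once per root,
    which is the per-candidate cost I am worried about.<br>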

    <br>

    Therefore, for this specific use case of NUMA affinity, I'd like to

    propose, as an alternative, introducing a concept of resource group

    distance in the rp graph.<br>

    <br>

    * numa affinity case<br>

      - group_distance(1:2)=1<br>

    * anti numa affinity<br>

      - group_distance(1:2)&gt;1<br>

    <br>

    which can be realized by looking at the cached adjacent rp (i.e. the

    parent id).<br>

    (Supporting group_distance=N (N&gt;1) would be future research, or we

    could implement it anyway and overlook the performance cost.)<br>
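
    <br>

    A sketch of the distance-1 check using only the parent id. The names
    are hypothetical, and I am assuming both rps belong to the same tree and
    that two groups landing on the same rp (distance 0) count as neither
    affinity nor anti-affinity:<br>

<pre>
```python
# Hypothetical cached adjacency data: rp name maps to its parent rp name.
PARENTS = {
    "compute": None,
    "numa0": "compute",
    "numa1": "compute",
    "pf0": "numa0",
}


def group_distance_is_one(parents, rp_a, rp_b):
    """True if one rp is the direct parent of the other."""
    return parents[rp_a] == rp_b or parents[rp_b] == rp_a


def numa_affinity(parents, rp_a, rp_b):
    """group_distance(a:b)=1, decided with a single parent id lookup per rp."""
    return group_distance_is_one(parents, rp_a, rp_b)


def numa_anti_affinity(parents, rp_a, rp_b):
    """group_distance(a:b) greater than 1: different rps that are not adjacent."""
    return rp_a != rp_b and not group_distance_is_one(parents, rp_a, rp_b)
```
</pre>

    No tree walk is needed here, which is the whole point: pf0 and numa0
    are adjacent, pf0 and numa1 are not.<br>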

    <br>

    One drawback is that we can't use this if you create nested layers

    more than one level deep under NUMA rps,<br>

    but is that the case for OvS bandwidth?<br>

    <br>

    Another alternative is having a "closure table" from which we can

    retrieve all the descendant rp ids of an rp without joining tables,<br>

    but... online migration cost?<br>

    <br>

    - tetsuro<br>

    <br>

  </body>

</html>