<div dir="ltr"><div dir="ltr"><div>Let me contribute another idea here. I'm fairly sure I haven't checked it against every case, given my limited view of the problem.</div><div><br></div><div>I'm thinking we can keep using the query string and build a tree structure out of the request group numbers. I know about the numbered request group problem for cyborg and neutron, but I think there must be some way to describe which instance NUMA node a cyborg device will be attached to. So I suspect the numbered request groups aren't at fault; maybe we are just missing a way to describe that.</div><div><br></div><div>Take the case in the spec <a href="https://review.openstack.org/#/c/650476">https://review.openstack.org/#/c/650476</a>, an instance with one NUMA node and two VFs from different networks. We can write it as below:</div><div><br></div><div>?resources=DISK_GB:10&</div><div>resources1=VCPU:2,MEMORY_MB:128&</div><div>resources1.1=VF:1&required=NET_A&</div><div>resources1.2=VF:1&required=NET_B</div><div><br></div><div>As another example, we request an instance with two NUMA nodes, with 2 VCPUs and 128 MB of memory in each node. Each node has two VFs that come from different PFs for HA.</div><div><br></div><div>?resources=DISK_GB:10&</div><div>resources1=VCPU:2,MEMORY_MB:128&</div><div>resources1.1=VF:1&</div><div>resources1.2=VF:1&</div><div>resources2=VCPU:2,MEMORY_MB:128&</div><div>resources2.1=VF:1&</div><div>resources2.2=VF:1&</div><div>group_policy=isolate&</div><div>group_policy1=isolate&</div><div>group_policy2=isolate</div><div><br></div><div>The `group_policy` ensures that resources1 and resources2 don't come from the same RP. The `group_policy1` ensures the `resources1.x` groups don't come from the same RP. 
The `group_policy2` ensures the `resources2.x` groups don't come from the same RP.</div><div><br></div><div>For the cyborg case, I think we can propose a flavor extra spec as below:</div><div>accel:device_profile.[numa node id]=<profile_name></div><div><br></div><div>Then we will know which instance NUMA node the user wants the cyborg device attached to.</div><div><br></div><div>Cyborg only needs to return an un-numbered request group; Nova will then use all the 'hw:xxx' extra specs and 'accel:device_profile.[numa node id]' to generate a placement request like the ones above.</div><div><br></div><div>For example, if it is a PCI device under the first NUMA node, the extra spec will be 'accel:device_profile.0=<profile_name>'. Cyborg can return a simple request 'resources=CYBORG_PCI_XX_DEVICE:1', and we merge this into the request group 'resources1=VCPU:2,MEMORY_MB:128,CYBORG_PCI_XX_DEVICE:1'. If the PCI device has a special trait, then cyborg should return the request group as 'resources1=CYBORG_PCI_XX_DEVICE:1&required=SOME_TRAIT', and Nova merges this into the placement request as 'resources1.1'.</div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Chris Dent <<a href="mailto:cdent%2Bos@anticdent.org">cdent+os@anticdent.org</a>> wrote on Tue, Apr 9, 2019 at 8:42 PM:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>

Spec: <a href="https://review.openstack.org/650476" rel="noreferrer" target="_blank">https://review.openstack.org/650476</a><br>

<br>

From the commit message:<br>

<br>

     To support NUMA and similar concepts, this proposes the ability<br>

     to request resources from different providers nested under a<br>

     common subtree (below the root provider).<br>

<br>

There's much in the feature described by the spec and the surrounding<br>

context that is frequently a source of contention in the placement<br>

group, so working through this spec is probably going to require<br>

some robust discussion. Doing most of that before the PTG will help<br>

make sure we're not going in circles in person.<br>

<br>

Some of the areas of potential contention:<br>

<br>

* Adequate for limited but maybe not all use case solutions<br>

* Strict trait constructionism<br>

* Evolving the complexity of placement solely for the satisfaction<br>

   of hardware representation in Nova<br>

* Inventory-less resource providers<br>

* Developing new features in placement before existing features are<br>

   fully used in client services<br>

* Others?<br>

<br>

I list these not because they are deal breakers or the only things<br>

that matter, but because they have presented stumbling blocks in<br>

the past and we may as well work to address them (or make an<br>

agreement to punt them until later) otherwise there will be<br>

lingering dread.<br>

<br>

And, beyond all that squishy stuff, there is the necessary<br>

discussion over the solution described in the spec. There are<br>

several alternatives listed in the spec, and a few more in the<br>

comments. We'd like to figure out the best solution that can<br>

actually be done in a reasonable amount of time, not the best<br>

solution in the absolute.<br>

<br>

Discuss!<br>

<br>

-- <br>

Chris Dent                       ٩◔̯◔۶           <a href="https://anticdent.org/" rel="noreferrer" target="_blank">https://anticdent.org/</a><br>

freenode: cdent                                         tw: @anticdent</blockquote></div>
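<div dir="ltr"><div>To make the nested numbering in the reply concrete, here is a minimal Python sketch that generates the second example's query string (two NUMA nodes, two VFs each). The names `fmt` and `build_query` are hypothetical, and the dotted `N.M` suffix is only the syntax proposed in the reply; the sketch builds the string and does not assume placement accepts it.</div></div>

```python
from urllib.parse import urlencode


def fmt(resources):
    """Render a dict such as {"VCPU": 2} as "VCPU:2"."""
    return ",".join(f"{rc}:{amount}" for rc, amount in resources.items())


def build_query(shared, numa_nodes):
    """Build a query string with one numbered group per NUMA node.

    shared: resources for the un-numbered group, e.g. {"DISK_GB": 10}
    numa_nodes: list of (node_resources, [child_resources, ...]) tuples
    """
    params = [("resources", fmt(shared))]
    for i, (node_resources, children) in enumerate(numa_nodes, start=1):
        params.append((f"resources{i}", fmt(node_resources)))
        for j, child in enumerate(children, start=1):
            # Hypothetical nested suffix "i.j", as proposed in the reply.
            params.append((f"resources{i}.{j}", fmt(child)))
    # Isolate the top-level groups, then the children of every node.
    params.append(("group_policy", "isolate"))
    for i, (_, children) in enumerate(numa_nodes, start=1):
        if children:
            params.append((f"group_policy{i}", "isolate"))
    # Keep ":" and "," unescaped so the output reads like the examples.
    return "?" + urlencode(params, safe=":,")


if __name__ == "__main__":
    node = ({"VCPU": 2, "MEMORY_MB": 128}, [{"VF": 1}, {"VF": 1}])
    print(build_query({"DISK_GB": 10}, [node, node]))
```

<div dir="ltr"><div>Under this scheme, merging a cyborg request for a given NUMA node would amount to adding its resources to that node's dict, or appending one more child when the device carries its own required trait.</div></div>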