Mike,

Here is the bug I reported: https://bugs.launchpad.net/bugs/1795920

Cc'ing: Sean

On Nov 12, 2018, at 8:27 AM, Satish Patel <satish.txt@gmail.com> wrote:

Mike,

I hit the same issue a month ago when I rolled out SR-IOV in my cloud, and this is how I solved it: set the following property on the flavor:

    hw:numa_nodes=2

It will spread the instance's vCPUs across NUMA nodes. Yes, there is a small performance penalty, but if you tune your application accordingly, you are fine.

And yes, this is a bug. I have already opened a ticket and I believe folks are working on it, but it's not a simple fix; they may ship a new feature in a coming OpenStack release.
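A minimal sketch of applying that property with the openstack CLI (the flavor name vm-2 is taken from the flavor quoted in Mike's message below; adjust for your deployment):

    # Split the guest into two virtual NUMA nodes so its vCPUs and RAM
    # can be placed on both host NUMA nodes:
    openstack flavor set vm-2 --property hw:numa_nodes=2

    # Confirm the property took effect:
    openstack flavor show vm-2 -c properties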
On Nov 11, 2018, at 9:25 PM, Mike Joseph <mj@mode.net> wrote:

Hi folks,

It appears that the numa_policy attribute of a PCI alias is ignored for flavors referencing that alias if the flavor also has hw:cpu_policy=dedicated set. The alias config is:

    alias = { "name": "mlx", "device_type": "type-VF", "vendor_id": "15b3", "product_id": "1004", "numa_policy": "preferred" }

And the flavor config is:

    {
      "OS-FLV-DISABLED:disabled": false,
      "OS-FLV-EXT-DATA:ephemeral": 0,
      "access_project_ids": null,
      "disk": 10,
      "id": "221e1bcd-2dde-48e6-bd09-820012198908",
      "name": "vm-2",
      "os-flavor-access:is_public": true,
      "properties": "hw:cpu_policy='dedicated', pci_passthrough:alias='mlx:1'",
      "ram": 8192,
      "rxtx_factor": 1.0,
      "swap": "",
      "vcpus": 2
    }

In short, our compute nodes have an SR-IOV Mellanox NIC (ConnectX-3) with 16 VFs configured. We wish to expose these VFs to VMs that are scheduled onto the host. However, the NIC is in NUMA region 0, which means that only half of the compute node's CPU cores would be usable if we required VM affinity to the NIC's NUMA region. But we don't need that, since we are okay with cross-region access to the PCI device.

However, we do need CPU pinning to work, in order to have efficient cache hits on our VM processes. Therefore, we still want to pin our vCPUs to pCPUs, even if the pins end up on a NUMA region opposite the NIC. The spec for numa_policy seems to indicate that this is exactly the intent of the option:

https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/share-pci-between-numa-nodes.html

But, with the above config, we still get PCI affinity scheduling errors:

'Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the given host NUMA topology.'

This strikes me as a bug, but perhaps I am missing something here?

Thanks,
MJ
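For completeness, a sketch of where each piece above lives (the [pci] alias line is copied verbatim from the message above; the flavor command is a hypothetical reconstruction of the quoted flavor properties):

    # /etc/nova/nova.conf, on both the controller (API/scheduler) and
    # the compute nodes:
    [pci]
    alias = { "name": "mlx", "device_type": "type-VF", "vendor_id": "15b3", "product_id": "1004", "numa_policy": "preferred" }

    # The flavor properties, recreated with the openstack CLI:
    openstack flavor set vm-2 \
      --property hw:cpu_policy=dedicated \
      --property pci_passthrough:alias=mlx:1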
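And a quick way to check which NUMA node the NIC actually sits in, assuming the PF shows up as a network interface (the interface name here is hypothetical; substitute your own):

    # 0 would match the "NUMA region 0" placement described above;
    # -1 means the platform reported no locality for the device.
    cat /sys/class/net/ens2f0/device/numa_node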