Hi all,

At last, I found the root cause of these 2 problems. And I suggest adding these words to https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html:
```
Prerequisites: libvirt >= 7.9.0, e.g. Ubuntu 22.04, which uses libvirt 8.0.0 by default.
```
Root cause of problem 1, which is "no valid host":
- The libvirt version was too low.

Root cause of problem 2, which is "why is there network topology on the DPU after the openstack port create command":
- Because I added --binding-profile params to the openstack port create command, which is NOT right.

----
Simon Jones

Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com> wrote on Thu, Mar 2, 2023 at 20:30:
Hi {Sean, Simon},
did you ever give a presentation on the DPU support
Yes, there were a couple at different stages.
The following is one of the older ones that references the SMARTNIC VNIC type, but we later switched to REMOTE_MANAGED in the final code: https://www.openvswitch.org/support/ovscon2021/slides/smartnic_port_binding.... However, it has a useful diagram on page 15 which shows the interactions of the different components. A lot of its other content is present in the OpenStack docs now, which we added during the feature development.
There is also a presentation with a demo that we did at the Open Infra summit https://youtu.be/Amxp-9yEnsU (I could not attend but we prepared the material after the features got merged).
Generally, as Sean described, the aim of this feature is to make the interaction between components present at the hypervisor and the DPU side automatic but, in order to make this workflow explicitly different from SR-IOV or offload at the hypervisor side, one has to use the "remote_managed" flag. This flag allows Nova to differentiate between "regular" VFs and the ones that have to be programmed by a remote host (DPU) - hence the name.
A port needs to be pre-created with the remote-managed type - that way when Nova tries to schedule a VM with that port attached, it will find hosts which actually have PCI devices tagged with the "remote_managed": "true" in the PCI whitelist.
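To illustrate, a minimal sketch of that flow (the network and port names are taken from the outputs later in this thread; the flavor, image, and server names are placeholders):
```
# pre-create a port whose VNIC type marks it as remotely managed (DPU-programmed)
openstack port create --network selfservice --vnic-type remote-managed pf0vf1

# boot a VM with the pre-created port attached; scheduling will only match
# hosts whose PCI whitelist has devices tagged "remote_managed": "true"
openstack server create --flavor m1.small --image ubuntu-22.04 --port pf0vf1 vm1
```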
The important thing to note here is that you must not use PCI passthrough directly for this - Nova will create a PCI device request automatically with the remote_managed flag included. There is currently no way to instruct Nova to choose one vendor/device ID vs the other for this (any remote_managed=true device from a pool will match) but maybe the work that was recently done to store PCI device information in the Placement service will pave the way for such granularity in the future.
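To make that concrete, a sketch of the compute-node [pci] section this implies (vendor/product IDs are the ConnectX VF IDs used elsewhere in this thread):
```
[pci]
# compute-node nova.conf: tag DPU-managed VFs; no [pci] alias or flavor
# extra spec is needed - nova builds the PCI request itself for these ports
passthrough_whitelist = {"vendor_id": "15b3", "product_id": "101e", "remote_managed": "true"}
```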
Best Regards, Dmitrii Shcherbakov LP/MM/oftc: dmitriis
On Thu, Mar 2, 2023 at 1:54 PM Sean Mooney <smooney@redhat.com> wrote:
adding Dmitrii who was the primary developer of the openstack integration so they can provide more insight.
Dmitrii, did you ever give a presentation on the DPU support and how it's configured/integrated that might help fill in the gaps for Simon?
more inline.
On Thu, 2023-03-02 at 11:05 +0800, Simon Jones wrote:
But there are these things:
1) Here is what actually happened in my test:
- To be clear, I use a DPU in the compute node, as in the diagram in https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html .
- I configured everything exactly following https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html, as described below in "3) Let me post all that I did following this link".
- In my test, I found that after the first three commands ("openstack network create ...", "openstack subnet create ...", "openstack port create ..."), network topology already exists on the DPU side, and rules exist in the OVN north DB and south DB on the controller, like this:
```
root@c1:~# ovn-nbctl show
switch 9bdacdd4-ca2a-4e35-82ca-0b5fbd3a5976 (neutron-066c8dc2-c98b-4fb8-a541-8b367e8f6e69) (aka selfservice)
    port 01a68701-0e6a-4c30-bfba-904d1b9813e1
        addresses: ["unknown"]
    port 18a44c6f-af50-4830-ba86-54865abb60a1 (aka pf0vf1)
        addresses: ["fa:16:3e:13:36:e2 172.1.1.228"]

gyw@c1:~$ sudo ovn-sbctl list Port_Binding
_uuid        : 61dc8bc0-ab33-4d67-ac13-0781f89c905a
chassis      : []
datapath     : 91d3509c-d794-496a-ba11-3706ebf143c8
encap        : []
external_ids : {name=pf0vf1, "neutron:cidrs"="172.1.1.241/24", "neutron:device_id"="", "neutron:device_owner"="", "neutron:network_name"=neutron-066c8dc2-c98b-4fb8-a541-8b367e8f6e69, "neutron:port_name"=pf0vf1, "neutron:project_id"="512866f9994f4ad8916d8539a7cdeec9", "neutron:revision_number"="1", "neutron:security_group_ids"="de8883e8-ccac-4be2-9bb2-95e732b0c114"}

root@c1c2dpu:~# sudo ovs-vsctl show
62cf78e5-2c02-471e-927e-1d69c2c22195
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port br-int
            Interface br-int
                type: internal
        Port ovn--1
            Interface ovn--1
                type: geneve
                options: {csum="true", key=flow, remote_ip="172.168.2.98"}
        Port pf0vf1
            Interface pf0vf1
    ovs_version: "2.17.2-24a81c8"
```
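(Note: chassis : [] in the Port_Binding output means the logical port is not yet bound to any chassis. One way to re-check this after the VM is created, assuming the port UUID shown in the ovn-nbctl output above:)
```
# chassis stays empty ([]) until nova binds the port to a host/DPU;
# re-run after "openstack server create" to see it populated
sudo ovn-sbctl find Port_Binding logical_port=18a44c6f-af50-4830-ba86-54865abb60a1
```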
That's why I guess the "first three commands" have already created the network topology, and the "openstack server create" command only needs to plug the VF into the VM on the HOST SIDE and does NOT need to CALL NEUTRON, as the network is already done.

no, that just looks like the standard bridge topology that gets created when you provision the dpu to be used with openstack via ovn.
that looks unrelated to the neutron command you ran.
- In my test, I then ran the "openstack server create" command and got the ERROR "No valid host...", which is what the email above said. The reason has already been given: it's nova-scheduler's PCI filter module reporting no valid host. The reason "nova-scheduler's PCI filter module reports no valid host" is that nova-scheduler could NOT see the PCI information of the compute node. The reason "nova-scheduler could NOT see the PCI information of the compute node" is that the compute node's /etc/nova/nova.conf configures the remote_managed tag like this:
```
[pci]
passthrough_whitelist = {"vendor_id": "15b3", "product_id": "101e", "physical_network": null, "remote_managed": "true"}
alias = { "vendor_id":"15b3", "product_id":"101e", "device_type":"type-VF", "name":"a1" }
```
2) Let me discuss some design details of the "remote_managed" tag; I don't know if my understanding of the design of openstack with DPU is right:
- On the neutron-server side, the remote_managed tag is used in the "openstack port create ..." command. This command makes neutron-server / OVN / ovn-controller / ovs set up the network topology, as said above. I think this is right, because my test shows that.
that is not correct. your test does not show what you think it does; it shows the basic bridge topology and flow configuration that ovn installs by default when it manages an ovs.
please read the design docs for this feature, for both nova and neutron, to understand how the interaction works.
https://specs.openstack.org/openstack/nova-specs/specs/yoga/implemented/inte...
https://specs.openstack.org/openstack/neutron-specs/specs/yoga/off-path-smar...
- On the nova side, there are 2 things to process: first the PCI filter, second nova-compute plugging the VF into the VM.
If the link above is right, the remote_managed tag exists in /etc/nova/nova.conf on the controller node and in /etc/nova/nova.conf on the compute node. As said above ("- In my test, I then ran the "openstack server create" command"), I got an ERROR at this step. So what should the "PCI passthrough filter" do? How should it be configured?
Then, if the "PCI passthrough filter" stage passes, what will nova-compute do on the compute node?
3) Let me post all that I did following this link: https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html
- build the openstack physical env, like plugging the DPU into the compute node, using a VM as controller ... etc.
- build openstack nova, neutron, ovn, ovn-vif, ovs following that link.
- configure the DPU side /etc/neutron/neutron.conf
- configure the host side /etc/nova/nova.conf
- configure the host side /etc/nova/nova-compute.conf
- run the first 3 commands (see the sketch after this list)
- last, run this command and get the ERROR
---- Simon Jones
Sean Mooney <smooney@redhat.com> wrote on Wed, Mar 1, 2023 at 18:35:
On Wed, 2023-03-01 at 18:12 +0800, Simon Jones wrote:
Thanks a lot !!!
As you say, I follow
https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html.
And I want to use DPU mode, not "disable DPU mode". So I think I should follow the link above exactly, so I use vnic-type=remote_managed. In my opinion, after I run the first three commands ("openstack network create ...", "openstack subnet create ...", "openstack port create ..."), the VF rep port and the OVN and OVS rules are all ready.

no, at that point nothing will have been done on ovn/ovs.
that will only happen after the port is bound to a vm and host.
What I should do in "openstack server create ..." is to JUST add the PCI device into the VM, and NOT call neutron-server in nova-compute on the compute node (like calling port_binding or something).

this is incorrect.
But as the logs and steps in the emails above said, nova-compute calls port_binding to neutron-server while running the command "openstack server create ...".
So I still have questions:
1) Is my opinion right? Which is "JUST add the PCI device into the VM, do NOT call neutron-server in nova-compute on the compute node (like calling port_binding or something)".

no, this is not how it's designed. until you attach the logical port to a vm (either at runtime or at vm create) the logical port is not associated with any host or physical dpu/vf.
so it's not possible to instantiate the openflow rules in ovs from the logical switch model in the ovn north db, as no chassis info has been populated and we do not have the dpu serial info in the port binding details.

2) If it's right, how do I deal with this? That is, how do I JUST add the PCI device into the VM and NOT call neutron-server? By command or by configuration? Is there some document?

no, this happens automatically when nova does the port binding, which cannot happen until after the vm is scheduled to a host.
---- Simon Jones
Sean Mooney <smooney@redhat.com> wrote on Wed, Mar 1, 2023 at 16:15:
On Wed, 2023-03-01 at 15:20 +0800, Simon Jones wrote:
> BTW, this link (
> https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html
> ) said I SHOULD add "remote_managed" in /etc/nova/nova.conf, is that WRONG ?
no, it's not wrong, but for dpu smart nics you have to make a choice when you deploy: either they can be used in dpu mode, in which case remote_managed should be set to true and you can only use them via neutron ports with vnic-type=remote_managed as described in that doc
https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html#launch...
or, if you disable dpu mode in the nic firmware, then you should remove remote_managed from the pci device list, and then it can be used like a normal vf, either for neutron sriov with vnic-type=direct or via flavor based pci passthrough.
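(A rough sketch of the two corresponding [pci] configurations, with the device IDs from this thread; the physical_network value is a placeholder:)
```
[pci]
# dpu mode: VF consumable only via neutron ports with vnic-type=remote_managed
passthrough_whitelist = {"vendor_id": "15b3", "product_id": "101e", "remote_managed": "true"}

# dpu mode disabled in firmware: a normal VF, usable e.g. for neutron sriov
# with vnic-type=direct (use this line instead of the one above)
# passthrough_whitelist = {"vendor_id": "15b3", "product_id": "101e", "physical_network": "physnet2"}
```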
the issue you were having is that you configured the pci device list to contain "remote_managed: true", which means the vf can only be consumed by a neutron port with vnic-type=remote_managed; when you have "remote_managed: false" or unset, you can use it via vnic-type=direct. i forgot that slight detail:
vnic-type=remote_managed is required for "remote_managed: true".
in either case you found the correct doc
https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html
neutron sriov port configuration is documented here: https://docs.openstack.org/neutron/latest/admin/config-sriov.html and nova flavor based pci passthrough is documented here: https://docs.openstack.org/nova/latest/admin/pci-passthrough.html
all three serve slightly different uses. both neutron procedures are exclusively for network interfaces.
https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html
requires the use of ovn deployed on the dpu to configure the VF controlplane. https://docs.openstack.org/neutron/latest/admin/config-sriov.html uses the sriov nic agent to manage the VF with ip tools. https://docs.openstack.org/nova/latest/admin/pci-passthrough.html is intended for pci passthrough of stateless accelerators like qat devices. while the nova flavor approach can be used with nics, it's not how it's generally meant to be used, and when used to passthrough a nic the expectation is that it's not related to a neutron network.
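(For completeness, a sketch of what that flavor-based route looks like; the alias name and the QAT vendor/device IDs here are illustrative assumptions:)
```
# compute-node nova.conf: whitelist the device and give it an alias
[pci]
passthrough_whitelist = {"vendor_id": "8086", "product_id": "0443"}
alias = { "vendor_id": "8086", "product_id": "0443", "device_type": "type-VF", "name": "qat-vf" }
```
The same alias also has to be set in the controller-side nova.conf so the scheduler can resolve it; the device is then requested through the flavor rather than through a neutron port:
```
# request one such device through the flavor, then boot as usual
openstack flavor set m1.small --property "pci_passthrough:alias"="qat-vf:1"
```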