Hi all,

At last, I found the root cause of these two problems.
I suggest adding the following to https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html:
```
Prerequisites:
libvirt >= 7.9.0 (for example Ubuntu 22.04, which ships libvirt 8.0.0 by default).
```
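
To check whether a given host meets this prerequisite, a `sort -V` comparison works. This is only a sketch: the `have` value is hardcoded here and would normally come from `libvirtd --version`.

```shell
# Minimum libvirt required for the remote-managed port flow
required="7.9.0"
# On a real host, substitute: have="$(libvirtd --version | awk '{print $NF}')"
have="8.0.0"
# sort -V sorts version strings numerically; if the lowest of the two
# is $required, then $have is at least $required
if [ "$(printf '%s\n' "$required" "$have" | sort -V | head -n1)" = "$required" ]; then
  echo "libvirt $have satisfies the >= $required prerequisite"
else
  echo "libvirt $have is too old; upgrade to >= $required"
fi
```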

Root cause of problem 1 ("no valid host"):
- The libvirt version was too low.

Root cause of problem 2 (why network topology appeared on the DPU after the "openstack port create" command):
- I had added --binding-profile parameters to the "openstack port create" command, which is NOT correct.
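
For reference, a corrected "openstack port create" invocation would pass only the VNIC type and leave the binding profile alone; Nova fills the profile in at bind time. The network and port names below are the ones from earlier in this thread and are illustrative:

```shell
# Create the port with the remote-managed VNIC type only;
# do NOT pass --binding-profile, Nova populates it during port binding.
openstack port create --network selfservice \
    --vnic-type remote-managed pf0vf1
```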

----
Simon Jones


On Thu, Mar 2, 2023 at 20:30, Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com> wrote:
Hi {Sean, Simon},

> did you ever give a presentation on the DPU support

Yes, there were a couple at different stages.

The following is one of the older ones; it references the SMARTNIC VNIC type, which we later switched to REMOTE_MANAGED in the final code: https://www.openvswitch.org/support/ovscon2021/slides/smartnic_port_binding.pdf. It has a useful diagram on page 15 which shows the interactions of the different components. A lot of its other content is now present in the OpenStack docs, which we added during feature development.

There is also a presentation with a demo that we did at the Open Infra summit https://youtu.be/Amxp-9yEnsU (I could not attend but we prepared the material after the features got merged).

Generally, as Sean described, the aim of this feature is to make the interaction between components present at the hypervisor and the DPU side automatic but, in order to make this workflow explicitly different from SR-IOV or offload at the hypervisor side, one has to use the "remote_managed" flag. This flag allows Nova to differentiate between "regular" VFs and the ones that have to be programmed by a remote host (DPU) - hence the name.

A port needs to be pre-created with the remote-managed type - that way when Nova tries to schedule a VM with that port attached, it will find hosts which actually have PCI devices tagged with the "remote_managed": "true" in the PCI whitelist.
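
A sketch of that flow, with placeholder flavor, image, and port names (not from the original thread):

```shell
# Pre-create a remote-managed port; its binding profile stays empty
# until Nova binds it during scheduling.
openstack port create --network selfservice \
    --vnic-type remote-managed rm-port

# Boot with the pre-created port attached; the scheduler will only match
# hosts whose PCI whitelist tags devices with "remote_managed": "true".
openstack server create --flavor m1.small --image ubuntu-22.04 \
    --port rm-port vm-with-dpu
```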

The important thing to note here is that you must not use PCI passthrough directly for this - Nova will create a PCI device request automatically with the remote_managed flag included. There is currently no way to instruct Nova to choose one vendor/device ID vs the other for this (any remote_managed=true device from a pool will match) but maybe the work that was recently done to store PCI device information in the Placement service will pave the way for such granularity in the future.

Best Regards,
Dmitrii Shcherbakov
LP/MM/oftc: dmitriis


On Thu, Mar 2, 2023 at 1:54 PM Sean Mooney <smooney@redhat.com> wrote:
adding Dmitrii, who was the primary developer of the openstack integration, so
they can provide more insight.

Dmitrii, did you ever give a presentation on the DPU support and how it's configured/integrated
that might help fill in the gaps for Simon?

more inline.

On Thu, 2023-03-02 at 11:05 +0800, Simon Jones wrote:
> E...
>
> But there are these things:
>
> 1) What actually happened in my test:
>
> - To be clear, I use a DPU in the compute node, as shown in the graph at
> https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html .
>
> - I configured exactly as described in
> https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html;
> the full steps are listed below in section 3).
>
> - In my test, I found that after the first three commands ("openstack
> network create ...", "openstack subnet create ...", "openstack port create ..."),
> network topology exists on the DPU side, and rules exist in the OVN
> north DB and south DB of the controller, like this:
>
> > ```
> > root@c1:~# ovn-nbctl show
> > switch 9bdacdd4-ca2a-4e35-82ca-0b5fbd3a5976
> > (neutron-066c8dc2-c98b-4fb8-a541-8b367e8f6e69) (aka selfservice)
> >     port 01a68701-0e6a-4c30-bfba-904d1b9813e1
> >         addresses: ["unknown"]
> >     port 18a44c6f-af50-4830-ba86-54865abb60a1 (aka pf0vf1)
> >         addresses: ["fa:16:3e:13:36:e2 172.1.1.228"]
> >
> > gyw@c1:~$ sudo ovn-sbctl list Port_Binding
> > _uuid               : 61dc8bc0-ab33-4d67-ac13-0781f89c905a
> > chassis             : []
> > datapath            : 91d3509c-d794-496a-ba11-3706ebf143c8
> > encap               : []
> > external_ids        : {name=pf0vf1, "neutron:cidrs"="172.1.1.241/24",
> > "neutron:device_id"="", "neutron:device_owner"="",
> > "neutron:network_name"=neutron-066c8dc2-c98b-4fb8-a541-8b367e8f6e69,
> > "neutron:port_name"=pf0vf1,
> > "neutron:project_id"="512866f9994f4ad8916d8539a7cdeec9",
> > "neutron:revision_number"="1",
> > "neutron:security_group_ids"="de8883e8-ccac-4be2-9bb2-95e732b0c114"}
> >
> > root@c1c2dpu:~# sudo ovs-vsctl show
> > 62cf78e5-2c02-471e-927e-1d69c2c22195
> >     Bridge br-int
> >         fail_mode: secure
> >         datapath_type: system
> >         Port br-int
> >             Interface br-int
> >                 type: internal
> >         Port ovn--1
> >             Interface ovn--1
> >                 type: geneve
> >                 options: {csum="true", key=flow, remote_ip="172.168.2.98"}
> >         Port pf0vf1
> >             Interface pf0vf1
> >     ovs_version: "2.17.2-24a81c8"
> > ```
> >
> That's why I guessed the "first three commands" had already created the network
> topology, and the "openstack server create" command only needs to plug the VF into
> the VM on the HOST SIDE and DOES NOT CALL NEUTRON, as the network is already done.
no, that just looks like the standard bridge topology that gets created when you provision
the dpu to be used with openstack via ovn.

that looks unrelated to the neutron commands you ran.
>
> - In my test, I then ran the "openstack server create" command and got the ERROR
> "No valid host...", as mentioned earlier in this email.
> As already stated, the reason is that nova-scheduler's PCI filter module reports
> no valid host. It reports no valid host because nova-scheduler could NOT see the
> PCI information of the compute node, which in turn is because the compute node's
> /etc/nova/nova.conf configures the remote_managed tag like this:
>
> > ```
> > [pci]
> > passthrough_whitelist = {"vendor_id": "15b3", "product_id": "101e",
> > "physical_network": null, "remote_managed": "true"}
> > alias = { "vendor_id":"15b3", "product_id":"101e",
> > "device_type":"type-VF", "name":"a1" }
> > ```
> >
>
> 2) Discussing some details of the "remote_managed" tag design; I don't know if this
> is right in the design of OpenStack with a DPU:
>
> - On the neutron-server side, use the remote_managed tag in the "openstack port create
> ..." command.
> This command makes neutron-server / OVN / ovn-controller / OVS set up the
> network topology, as said above.
> I think this is right, because the test shows that.
that is not correct.
your tests do not show what you think they do; they show the basic bridge
topology and flow configuration that ovn installs by default when it manages
an ovs instance.

please read the design docs for this feature, for both nova and neutron, to understand how the interaction works.
https://specs.openstack.org/openstack/nova-specs/specs/yoga/implemented/integration-with-off-path-network-backends.html
https://specs.openstack.org/openstack/neutron-specs/specs/yoga/off-path-smartnic-dpu-port-binding-with-ovn.html
>
> - On the nova side, there are two things to process: first the PCI passthrough
> filter, second nova-compute plugging the VF into the VM.
>
> If the link above is right, the remote_managed tag exists in
> /etc/nova/nova.conf on the controller node and in /etc/nova/nova.conf on the
> compute node.
> As said above ("- In my test, I then ran the 'openstack server create' command"),
> I got an ERROR at this step.
> So what should be done in the "PCI passthrough filter"? How should it be configured?
>
> Then, if the "PCI passthrough filter" stage passes, what will nova-compute do
> on the compute node?
>
> 3) Here is everything I did, following this link:
> https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html.
> - build the OpenStack physical environment, e.g. plug the DPU into the compute
> node, use a VM as the controller ... etc.
> - build OpenStack nova, neutron, ovn, ovn-vif, ovs following that link.
> - configure the DPU-side /etc/neutron/neutron.conf
> - configure the host-side /etc/nova/nova.conf
> - configure the host-side /etc/nova/nova-compute.conf
> - run the first three commands
> - last, run the "openstack server create" command and get the ERROR
>
> ----
> Simon Jones
>
>
> On Wed, Mar 1, 2023 at 18:35, Sean Mooney <smooney@redhat.com> wrote:
>
> > On Wed, 2023-03-01 at 18:12 +0800, Simon Jones wrote:
> > > Thanks a lot !!!
> > >
> > > As you say, I follow
> > > https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html,
> > > and I want to use DPU mode, not "disable DPU mode".
> > > So I think I should follow the link above exactly, and I use
> > > vnic-type=remote_managed.
> > > In my opinion, after I run the first three commands (which are "openstack
> > > network create ...", "openstack subnet create", "openstack port create ..."),
> > > the VF rep port and the OVN and OVS rules are all ready.
> > no, at that point nothing will have been done on ovn/ovs.
> >
> > that will only happen after the port is bound to a vm and host.
> >
> > > What I should do in "openstack server create ..." is to JUST add the PCI
> > > device to the VM, and NOT call neutron-server from nova-compute on the
> > > compute node (e.g. call port_binding or something).
> > this is incorrect.
> > >
> > > But as the logs and steps in the emails above show, nova-compute calls
> > > port_binding on neutron-server while running the "openstack server
> > > create ..." command.
> > >
> > > So I still have these questions:
> > > 1) Is my opinion right, i.e. "JUST add the PCI device to the VM, do NOT
> > > call neutron-server from nova-compute on the compute node (e.g. call
> > > port_binding or something)"?
> > no, this is not how it is designed.
> > until you attach the logical port to a vm (either at runtime or as part of
> > vm create)
> > the logical port is not associated with any host or physical dpu/vf.
> >
> > so it is not possible to instantiate the openflow rules in ovs from the
> > logical switch model
> > in the ovn north db, as no chassis info has been populated and we do not
> > have the dpu serial
> > info in the port binding details.
> > > 2) If it's right, how do I deal with this, i.e. how do I JUST add the PCI
> > > device to the VM and NOT call neutron-server? By command or by configuration?
> > > Is there some documentation?
> > no, this happens automatically when nova does the port binding, which cannot
> > happen until after
> > the vm is scheduled to a host.
> > >
> > > ----
> > > Simon Jones
> > >
> > >
> > > On Wed, Mar 1, 2023 at 16:15, Sean Mooney <smooney@redhat.com> wrote:
> > >
> > > > On Wed, 2023-03-01 at 15:20 +0800, Simon Jones wrote:
> > > > > BTW, this link
> > > > > (https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html)
> > > > > said I SHOULD add "remote_managed" in /etc/nova/nova.conf; is that WRONG?
> > > >
> > > > no, it's not wrong, but for dpu smart nics you have to make a choice when
> > > > you deploy:
> > > > either they can be used in dpu mode, in which case remote_managed should be
> > > > set to true and you can only use them via neutron ports with
> > > > vnic-type=remote_managed, as described in that doc:
> > > >
> > > > https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html#launch-an-instance-with-remote-managed-port
> > > >
> > > > or, if you disable dpu mode in the nic firmware, then you should remove
> > > > remote_managed from the pci device list and
> > > > then it can be used like a normal vf, either for neutron sriov ports with
> > > > vnic-type=direct or via flavor based pci passthrough.
> > > >
> > > > the issue you were having is that you configured the pci device list to
> > > > contain "remote_managed: true", which means
> > > > the vf can only be consumed by a neutron port with
> > > > vnic-type=remote_managed; when you have "remote_managed: false" or unset,
> > > > you can use it via vnic-type=direct. i forgot that slight detail that
> > > > vnic-type=remote_managed is required for "remote_managed: true".
> > > >
> > > >
> > > > in either case you found the correct doc:
> > > > https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html
> > > > neutron sriov port configuration is documented here:
> > > > https://docs.openstack.org/neutron/latest/admin/config-sriov.html
> > > > and nova flavor based pci passthrough is documented here:
> > > > https://docs.openstack.org/nova/latest/admin/pci-passthrough.html
> > > >
> > > > all three serve slightly different uses. both neutron procedures are
> > > > exclusively for network interfaces.
> > > > https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html
> > > > requires the use of ovn deployed on the dpu
> > > > to configure the VF controlplane.
> > > > https://docs.openstack.org/neutron/latest/admin/config-sriov.html uses
> > > > the sriov nic agent
> > > > to manage the VF with ip tools.
> > > > https://docs.openstack.org/nova/latest/admin/pci-passthrough.html is
> > > > intended for pci passthrough
> > > > of stateless accelerators like qat devices. while the nova flavor approach
> > > > can be used with nics, it is not how it is generally
> > > > meant to be used, and when used to pass through a nic the expectation is
> > > > that it is not related to a neutron network.
> > > >
> > > >
> >
> >