[openstack-dev][PCI passthrough] How to use PCI passthrough feature correctly? And is this BUG in update_devices_from_hypervisor_resources?

Dmitrii Shcherbakov dmitrii.shcherbakov at canonical.com
Thu Mar 2 12:29:25 UTC 2023


Hi {Sean, Simon},

> did you ever give a presentation on the DPU support

Yes, there were a couple at different stages.

The following is one of the older ones; it references the SMARTNIC
VNIC type, which we later switched to REMOTE_MANAGED in the final code:
https://www.openvswitch.org/support/ovscon2021/slides/smartnic_port_binding.pdf.
It has a useful diagram on page 15 showing the interactions of the
different components. A lot of its other content is now present in the
OpenStack docs, which we added during the feature development.

There is also a presentation with a demo that we did at the Open Infra
summit https://youtu.be/Amxp-9yEnsU (I could not attend but we prepared the
material after the features got merged).

Generally, as Sean described, the aim of this feature is to automate the
interaction between the components on the hypervisor and the DPU side,
but, in order to make this workflow explicitly different from SR-IOV or
hypervisor-side offload, one has to use the "remote_managed" flag. This
flag allows Nova to differentiate between "regular" VFs and the ones that
have to be programmed by a remote host (DPU) - hence the name.
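
As a rough sketch only (the vendor/product IDs are placeholders, and
depending on the release the option may be spelled device_spec rather
than passthrough_whitelist), the compute node's [pci] section would tag
such VFs like this:

```
[pci]
# Tag VFs that must be programmed from the DPU side as remote managed.
# Vendor/product IDs are examples; use the IDs of your NIC's VFs.
passthrough_whitelist = {"vendor_id": "15b3", "product_id": "101e", "remote_managed": "true"}
```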

A port needs to be pre-created with the remote-managed VNIC type - that
way, when Nova tries to schedule a VM with that port attached, it will find
hosts that actually have PCI devices tagged with "remote_managed": "true"
in the PCI whitelist.
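
A minimal sketch of that workflow, reusing the selfservice/pf0vf1 names
from the thread below and hypothetical flavor/image names (exact CLI flags
may vary by client version), could look like this:

```
# Pre-create the port with the remote-managed VNIC type
openstack port create --network selfservice --vnic-type remote-managed pf0vf1

# Boot a VM with that port attached; scheduling only succeeds on hosts
# exposing PCI devices whitelisted with "remote_managed": "true"
openstack server create --flavor m1.small --image ubuntu-22.04 \
  --port pf0vf1 vm-with-remote-managed-port
```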

The important thing to note here is that you must not use PCI passthrough
directly for this - Nova will create a PCI device request automatically
with the remote_managed flag included. There is currently no way to
instruct Nova to choose one vendor/device ID vs the other for this (any
remote_managed=true device from a pool will match) but maybe the work that
was recently done to store PCI device information in the Placement service
will pave the way for such granularity in the future.
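
To illustrate that point with a purely hypothetical whitelist: if two
different device types were both tagged as remote managed, a remote-managed
port could be satisfied by a VF from either pool, and there is no per-port
knob today to prefer one over the other:

```
[pci]
# Both entries are eligible for remote-managed ports; Nova does not
# currently let a port request pin one vendor/product ID over the other.
passthrough_whitelist = {"vendor_id": "15b3", "product_id": "101e", "remote_managed": "true"}
passthrough_whitelist = {"vendor_id": "15b3", "product_id": "<another-vf-id>", "remote_managed": "true"}
```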

Best Regards,
Dmitrii Shcherbakov
LP/MM/oftc: dmitriis


On Thu, Mar 2, 2023 at 1:54 PM Sean Mooney <smooney at redhat.com> wrote:

> adding Dmitrii, who was the primary developer of the openstack integration,
> so they can provide more insight.
>
> Dmitrii, did you ever give a presentation on the DPU support and how it is
> configured/integrated that might help fill in the gaps for Simon?
>
> more inline.
>
> On Thu, 2023-03-02 at 11:05 +0800, Simon Jones wrote:
> > E...
> >
> > But there are these things:
> >
> > 1) Here is what actually happened in my test:
> >
> > - To be clear, I use a DPU in the compute node, as in the graph in
> > https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html .
> >
> > - I configured everything exactly following
> > https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html,
> > as described below in "3) Everything I did following this link".
> >
> > - In my test, I found that after the first three commands ("openstack
> > network create ...", "openstack subnet create ...", "openstack port create
> > ..."), the network topology already exists on the DPU side, and there are
> > entries in the OVN north DB and south DB of the controller, like this:
> >
> > > ```
> > > root at c1:~# ovn-nbctl show
> > > switch 9bdacdd4-ca2a-4e35-82ca-0b5fbd3a5976
> > > (neutron-066c8dc2-c98b-4fb8-a541-8b367e8f6e69) (aka selfservice)
> > >     port 01a68701-0e6a-4c30-bfba-904d1b9813e1
> > >         addresses: ["unknown"]
> > >     port 18a44c6f-af50-4830-ba86-54865abb60a1 (aka pf0vf1)
> > >         addresses: ["fa:16:3e:13:36:e2 172.1.1.228"]
> > >
> > > gyw at c1:~$ sudo ovn-sbctl list Port_Binding
> > > _uuid               : 61dc8bc0-ab33-4d67-ac13-0781f89c905a
> > > chassis             : []
> > > datapath            : 91d3509c-d794-496a-ba11-3706ebf143c8
> > > encap               : []
> > > external_ids        : {name=pf0vf1, "neutron:cidrs"="172.1.1.241/24",
> > > "neutron:device_id"="", "neutron:device_owner"="",
> > > "neutron:network_name"=neutron-066c8dc2-c98b-4fb8-a541-8b367e8f6e69,
> > > "neutron:port_name"=pf0vf1,
> > > "neutron:project_id"="512866f9994f4ad8916d8539a7cdeec9",
> > > "neutron:revision_number"="1",
> > > "neutron:security_group_ids"="de8883e8-ccac-4be2-9bb2-95e732b0c114"}
> > >
> > > root at c1c2dpu:~# sudo ovs-vsctl show
> > > 62cf78e5-2c02-471e-927e-1d69c2c22195
> > >     Bridge br-int
> > >         fail_mode: secure
> > >         datapath_type: system
> > >         Port br-int
> > >             Interface br-int
> > >                 type: internal
> > >         Port ovn--1
> > >             Interface ovn--1
> > >                 type: geneve
> > >                 options: {csum="true", key=flow,
> remote_ip="172.168.2.98"}
> > >         Port pf0vf1
> > >             Interface pf0vf1
> > >     ovs_version: "2.17.2-24a81c8"
> > > ```
> > >
> > That's why I guess the "first three commands" have already created the
> > network topology, and the "openstack server create" command only needs to
> > plug the VF into the VM on the HOST SIDE and does NOT need to CALL NEUTRON,
> > as the network is already done.
> no, that just looks like the standard bridge topology that gets created when
> you provision the dpu to be used with openstack via ovn.
>
> that looks unrelated to the neutron commands you ran.
> >
> > - In my test, when I then ran the "openstack server create" command, I got
> > the ERROR "No valid host...", as described in the email above.
> > The reason, as already said, is that nova-scheduler's PCI filter module
> > reports no valid host. It reports no valid host because nova-scheduler
> > could NOT see the PCI information of the compute node, and that in turn is
> > because the compute node's /etc/nova/nova.conf configures the
> > remote_managed tag like this:
> >
> > > ```
> > > [pci]
> > > passthrough_whitelist = {"vendor_id": "15b3", "product_id": "101e",
> > > "physical_network": null, "remote_managed": "true"}
> > > alias = { "vendor_id":"15b3", "product_id":"101e",
> > > "device_type":"type-VF", "name":"a1" }
> > > ```
> > >
> >
> > 2) Let me discuss some detailed design of the "remote_managed" tag; I don't
> > know if this is right in the design of OpenStack with a DPU:
> >
> > - On the neutron-server side, the remote_managed tag is used in the
> > "openstack port create ..." command.
> > This command makes neutron-server / OVN / ovn-controller / OVS set up
> > the network topology, as said above.
> > I think this is right, because the test shows that.
> that is not correct.
> your tests do not show what you think they do; they show the basic bridge
> topology and flow configuration that ovn installs by default when it manages
> an ovs instance.
>
> please read the design docs for this feature for both nova and neutron to
> understand how the interaction works.
>
> https://specs.openstack.org/openstack/nova-specs/specs/yoga/implemented/integration-with-off-path-network-backends.html
>
> https://specs.openstack.org/openstack/neutron-specs/specs/yoga/off-path-smartnic-dpu-port-binding-with-ovn.html
> >
> > - On the nova side, there are 2 things to process: first the PCI
> > passthrough filter, second nova-compute plugging the VF into the VM.
> >
> > If the link above is right, the remote_managed tag exists in
> > /etc/nova/nova.conf of the controller node and in /etc/nova/nova.conf of
> > the compute node.
> > As said above ("- In my test, when I then ran the "openstack server create"
> > command"), I got the ERROR at this step.
> > So what should the "PCI passthrough filter" do? How should it be configured?
> >
> > Then, if the "PCI passthrough filter" stage passes, what will nova-compute
> > do on the compute node?
> >
> > 3) Everything I did following this link:
> > https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html.
> > - build the OpenStack physical env, like plugging the DPU into the compute
> > node, using a VM as the controller ... etc.
> > - build OpenStack nova, neutron, ovn, ovn-vif, ovs following that link.
> > - configure the DPU side /etc/neutron/neutron.conf
> > - configure the host side /etc/nova/nova.conf
> > - configure the host side /etc/nova/nova-compute.conf
> > - run the first 3 commands
> > - last, run the "openstack server create" command, and got the ERROR
> >
> > ----
> > Simon Jones
> >
> >
> > Sean Mooney <smooney at redhat.com> wrote on Wed, 1 Mar 2023 at 18:35:
> >
> > > On Wed, 2023-03-01 at 18:12 +0800, Simon Jones wrote:
> > > > Thanks a lot !!!
> > > >
> > > > As you say, I follow
> > > > https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html.
> > > > And I want to use DPU mode, not "disable DPU mode".
> > > > So I think I should follow the link above exactly, so I use
> > > > vnic-type=remote_managed.
> > > > In my opinion, after I run the first three commands ("openstack network
> > > > create ...", "openstack subnet create ...", "openstack port create ..."),
> > > > the VF rep port and the OVN and OVS rules are all ready.
> > > no, at that point nothing will have been done on ovn/ovs
> > >
> > > that will only happen after the port is bound to a vm and host.
> > >
> > > > What I should do in "openstack server create ..." is to JUST add the PCI
> > > > device to the VM and NOT call neutron-server from nova-compute on the
> > > > compute node (like calling port_binding or something).
> > > this is incorrect.
> > > >
> > > > But as the logs and steps in the emails above show, nova-compute calls
> > > > port_binding on neutron-server while running the command "openstack
> > > > server create ...".
> > > >
> > > > So the questions I still have are:
> > > > 1) Is my opinion right? That is, "JUST add the PCI device to the VM, do
> > > > NOT call neutron-server from nova-compute on the compute node (like
> > > > calling port_binding or something)".
> > > no, this is not how it's designed.
> > > until you attach the logical port to a vm (either at runtime or as part of
> > > vm create) the logical port is not associated with any host or physical
> > > dpu/vf.
> > >
> > > so it's not possible to instantiate the openflow rules in ovs from the
> > > logical switch model in the ovn north db, as no chassis info has been
> > > populated and we do not have the dpu serial info in the port binding
> > > details.
> > > > 2) If it's right, how do I do this? That is, how do I JUST add the PCI
> > > > device to the VM and NOT call neutron-server? By command or by
> > > > configuration? Is there some document?
> > > no, this happens automatically when nova does the port binding, which
> > > cannot happen until after the vm is scheduled to a host.
> > > >
> > > > ----
> > > > Simon Jones
> > > >
> > > >
> > > > Sean Mooney <smooney at redhat.com> wrote on Wed, 1 Mar 2023 at 16:15:
> > > >
> > > > > On Wed, 2023-03-01 at 15:20 +0800, Simon Jones wrote:
> > > > > > BTW, this link
> > > > > > (https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html)
> > > > > > said I SHOULD add "remote_managed" in /etc/nova/nova.conf; is that WRONG?
> > > > >
> > > > > no, it's not wrong, but for dpu smart nics you have to make a choice
> > > > > when you deploy.
> > > > > either they can be used in dpu mode, in which case remote_managed
> > > > > should be set to true
> > > > > and you can only use them via neutron ports with vnic-type=remote_managed
> > > > > as described in that doc
> > > > >
> > > > > https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html#launch-an-instance-with-remote-managed-port
> > > > >
> > > > > or, if you disable dpu mode in the nic firmware, then you should remove
> > > > > remote_managed from the pci device list and
> > > > > then it can be used like a normal vf, either for neutron sriov ports with
> > > > > vnic-type=direct or via flavor based pci passthrough.
> > > > >
> > > > > the issue you were having is you configured the pci device list to
> > > > > contain "remote_managed: true", which means
> > > > > the vf can only be consumed by a neutron port with
> > > > > vnic-type=remote_managed; when you have "remote_managed: false" or unset,
> > > > > you can use it via vnic-type=direct. i forgot that slight detail, that
> > > > > vnic-type=remote_managed is required for "remote_managed: true".
> > > > >
> > > > >
> > > > > in either case you found the correct doc
> > > > > https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html
> > > > > neutron sriov port configuration is documented here
> > > > > https://docs.openstack.org/neutron/latest/admin/config-sriov.html
> > > > > and nova flavor based pci passthrough is documented here
> > > > > https://docs.openstack.org/nova/latest/admin/pci-passthrough.html
> > > > >
> > > > > all three serve slightly different uses. both neutron procedures are
> > > > > exclusively for network interfaces.
> > > > > https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html
> > > > > requires the use of ovn deployed on the dpu
> > > > > to configure the VF control plane.
> > > > > https://docs.openstack.org/neutron/latest/admin/config-sriov.html uses
> > > > > the sriov nic agent
> > > > > to manage the VF with ip tools.
> > > > > https://docs.openstack.org/nova/latest/admin/pci-passthrough.html is
> > > > > intended for pci passthrough
> > > > > of stateless accelerators like qat devices. while the nova flavor
> > > > > approach can be used with nics, it is not how it is generally
> > > > > meant to be used, and when used to pass through a nic the expectation
> > > > > is that it is not related to a neutron network.
> > > > >
> > > > >
> > >
> > >
>
>