Technically, NVIDIA ASAP2 offloading is not supported by Neutron in-tree. ASAP2 is an out-of-tree fork of OVS; the hardware-offloaded OVS support in core Neutron via ML2/OVS and ML2/OVN officially supports only upstream OVS and the upstream in-kernel hardware offloads. https://docs.openstack.org/neutron/latest/admin/config-ovs-offload.html documents what Neutron supports, but that is not the same as ASAP2: it supports https://docs.openvswitch.org/en/latest/howto/tc-offload/ rather than https://docs.nvidia.com/networking/display/mlnxofedv23104091lts/ovs+offload+...

On 30/07/2025 18:21, Stig Telfer wrote:
Hello Vasyl -
Yes it does work, although it's not always easy to work with!
These OVS debug options might help you work out what is preventing hardware offloading:
```
ovs-appctl vlog/set file,dpif,dbg
ovs-appctl vlog/set file,dpif_netlink,dbg
ovs-appctl vlog/set file,netdev_offload_tc,dbg
```
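Beyond the debug logs, it is also worth double-checking that offload is actually switched on at both the OVS and NIC level before digging further. A minimal sketch, using standard OVS/ethtool commands and the device name from this thread:

```shell
# Confirm OVS hardware offload is enabled (a hard prerequisite for TC offload).
# Should print "true"; if not: ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
ovs-vsctl get Open_vSwitch . other_config:hw-offload

# Confirm the PF advertises TC offload in the kernel.
# Should print "hw-tc-offload: on"; if not: ethtool -K enp33s0f0np0 hw-tc-offload on
ethtool -k enp33s0f0np0 | grep hw-tc-offload
```

Note that `hw-offload=true` only takes effect after restarting ovs-vswitchd.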
Best wishes, Stig
On 29 Jul 2025, at 07:34, Vasyl Saienko <vsaienko@mirantis.com> wrote:
Hello OpenStack community,
I'm trying to set up OpenStack with Nvidia ASAP2 hardware offloading in switchdev mode according to OpenStack guidelines <https://docs.openstack.org/neutron/latest/admin/config-ovs-offload.html> and Nvidia guidelines <https://docs.nvidia.com/doca/sdk/ovs-kernel+hardware+offloads/index.html> on the ConnectX6DX.
I've tried different network types (VLAN and VXLAN), but offloading does not work for either of them.
From an OpenStack standpoint everything is configured according to the documentation: the representor device (eth14) is created and plugged into the br-int bridge, and the physical PF enp33s0f0np0 is set to switchdev mode:
```
devlink dev eswitch show pci/0000:21:00.0
pci/0000:21:00.0: mode switchdev inline-mode none encap-mode basic
```
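For anyone reproducing this, switchdev mode is typically enabled along these lines (a sketch using the PCI address and PF name from this thread; per the Nvidia docs, VFs must be created first and their drivers unbound before switching modes):

```shell
# Create VFs on the PF (the count is an example value)
echo 2 > /sys/class/net/enp33s0f0np0/device/sriov_numvfs

# Unbind the VF drivers before the mode change (VF PCI addresses vary per host)
# echo <vf-pci-addr> > /sys/bus/pci/drivers/mlx5_core/unbind

# Switch the eswitch into switchdev mode, which creates the representor devices
devlink dev eswitch set pci/0000:21:00.0 mode switchdev
```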
```
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl show
12e00cdb-6520-4666-908f-57f59e8b9bd4
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-ex
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port br-ex
            Interface br-ex
                type: internal
        Port pr-floating
            Interface pr-floating
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port br-int
            Interface br-int
                type: internal
        Port int-br-tenant
            Interface int-br-tenant
                type: patch
                options: {peer=phy-br-tenant}
        Port tap58ac2b17-fb
            tag: 1
            Interface tap58ac2b17-fb
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port eth14
            tag: 1
            Interface eth14
    Bridge br-tenant
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port br-tenant
            Interface br-tenant
                type: internal
        Port phy-br-tenant
            Interface phy-br-tenant
                type: patch
                options: {peer=int-br-tenant}
        Port enp33s0f0np0
            Interface enp33s0f0np0
    Bridge br-tun
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port br-tun
            Interface br-tun
                type: internal
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
    ovs_version: "3.0.0-0056-25.04-based-3.3.5"
```
But hardware offloading is not happening: tcpdump on eth14 still shows all packets, and no TC rules are created.
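A quick way to confirm the absence (or presence) of offload, using the representor name from this setup:

```shell
# If offload were working, flower filters with an "in_hw" flag would appear here
tc -s filter show dev eth14 ingress

# Datapath view: lists only flows that OVS has successfully offloaded to hardware;
# empty output means nothing is offloaded
ovs-appctl dpctl/dump-flows type=offloaded
```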
OpenStack uses complex Open vSwitch OpenFlow rules to set up virtual networking; in this scenario VLAN tagging is done by the following rule:
``` cookie=0x45b91976d5c4f175, duration=233.443s, table=60, n_packets=314, n_bytes=35574, priority=100,in_port=eth14 actions=set_field:0x11->reg5,set_field:0x1->reg6,resubmit(,73) ```
If we simplify the OpenFlow rules and align them with the Nvidia guidelines for Open vSwitch hardware offloading, by plugging the representor device directly into the physnet bridge and assigning the VLAN tag as a port attribute, hardware offloading starts working.
```
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl del-port eth14
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl add-port br-tenant eth14
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl set port eth14 tag=1443
```
```
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# for i in $(ip link | awk '{print $2}' | grep -e enp -e ens | grep -v vf$); do tc filter show dev $i ingress; done
filter protocol 802.1Q pref 6 flower chain 0
filter protocol 802.1Q pref 6 flower chain 0 handle 0x1
  vlan_id 1443
  vlan_prio 0
  vlan_ethtype arp
  dst_mac fa:16:3e:36:1b:90
  src_mac fa:16:3e:39:90:d8
  eth_type arp
  in_hw in_hw_count 1
        action order 1: vlan pop pipe
         index 1 ref 1 bind 1
        no_percpu
        used_hw_stats delayed

        action order 2: mirred (Egress Redirect to device eth14) stolen
        index 1 ref 1 bind 1
        cookie 4b5b5a701f4e3fee7ac1c198037522e3
        no_percpu
        used_hw_stats delayed
```
A similar issue occurs when using a VXLAN network. OpenStack sets up the VXLAN tunnel as follows, where the tunnel_id is inserted dynamically via OpenFlow rules (in_key=flow):
```
Port vxlan-0a0c0004
    Interface vxlan-0a0c0004
        type: vxlan
        options: {df_default="true", dst_port="4790", egress_pkt_mark="0", in_key=flow, local_ip="10.12.0.9", out_key=flow, remote_ip="10.12.0.4"}
```
In this configuration hardware offloading is not happening either. But when setting up VXLAN according to the Nvidia guidelines (no complex rules involved), hardware offloading starts working (tcpdump on the representor device shows only the first packet, and TC rules are created).
```
Port vxlan3
    Interface vxlan3
        type: vxlan
        options: {key="60987", local_ip="10.12.0.9", remote_ip="10.12.0.4"}
```
```
Cannot find device "enp33s0f0npf0vf"
Cannot find device "veth68d427d@if1"
filter protocol ip pref 2 flower chain 0
filter protocol ip pref 2 flower chain 0 handle 0x1
  dst_mac fa:16:3e:65:0c:6d
  src_mac 52:5a:d0:9e:8a:32
  eth_type ipv4
  enc_dst_ip 10.12.0.9
  enc_src_ip 10.12.0.8
  enc_key_id 60987
  enc_dst_port 4789
  enc_tos 0
  ip_flags nofrag
  in_hw in_hw_count 2
        action order 1: tunnel_key unset pipe
         index 2 ref 1 bind 1
        no_percpu
        used_hw_stats delayed

        action order 2: mirred (Egress Redirect to device eth15) stolen
        index 2 ref 1 bind 1
        cookie 5d44cdf6754fc237cc0bd688a6fe70a8
        no_percpu
        used_hw_stats delayed
```
Am I doing something wrong? Has anyone had different results?
--
Vasyl Saienko
Principal DevOps Engineer
vsaienko@mirantis.com
+(380) 66 072 07 17