Hello OpenStack community,
I'm trying to set up OpenStack with Nvidia ASAP2 hardware offloading in switchdev mode on a ConnectX-6 Dx, following both the OpenStack guidelines and the Nvidia guidelines.
I've tried both VLAN and VXLAN network types, but offloading works for neither.
From the OpenStack standpoint everything is configured according to the documentation: the representor device (eth14) is created and plugged into the br-int bridge, and the PF (physical function) enp33s0f0np0 is set to switchdev mode:
```
devlink dev eswitch show pci/0000:21:00.0
pci/0000:21:00.0: mode switchdev inline-mode none encap-mode basic
```
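For anyone comparing setups, here is a minimal sketch of how the usual offload prerequisites could be checked in one go. It assumes that, besides switchdev mode, `other_config:hw-offload` is expected to be `true` in OVS and `hw-tc-offload` is expected to be `on` for the PF. The helper names `eswitch_mode` and `show_offload_prereqs` are hypothetical; the PCI address and PF name are the ones from this post.

```shell
# Sketch only: helper names are hypothetical, not part of any tool.

# Extract the eswitch mode from `devlink dev eswitch show` output on stdin.
eswitch_mode() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "mode") { print $(i + 1); exit } }'
}

# Print the three settings that hardware offload usually depends on.
# $1 = PCI address of the PF, $2 = PF netdev name.
show_offload_prereqs() {
    local pci="$1" pf="$2"
    echo "eswitch mode:   $(devlink dev eswitch show "pci/$pci" | eswitch_mode)"
    echo "ovs hw-offload: $(ovs-vsctl --if-exists get Open_vSwitch . other_config:hw-offload)"
    echo "hw-tc-offload:  $(ethtool -k "$pf" | awk '/hw-tc-offload/ { print $2 }')"
}

# Usage (as root on the compute node):
#   show_offload_prereqs 0000:21:00.0 enp33s0f0np0
```

The functions only read state, so they should be safe to run on a live node.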
```
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl show
12e00cdb-6520-4666-908f-57f59e8b9bd4
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-ex
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port br-ex
            Interface br-ex
                type: internal
        Port pr-floating
            Interface pr-floating
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port br-int
            Interface br-int
                type: internal
        Port int-br-tenant
            Interface int-br-tenant
                type: patch
                options: {peer=phy-br-tenant}
        Port tap58ac2b17-fb
            tag: 1
            Interface tap58ac2b17-fb
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port eth14
            tag: 1
            Interface eth14
    Bridge br-tenant
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port br-tenant
            Interface br-tenant
                type: internal
        Port phy-br-tenant
            Interface phy-br-tenant
                type: patch
                options: {peer=int-br-tenant}
        Port enp33s0f0np0
            Interface enp33s0f0np0
    Bridge br-tun
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port br-tun
            Interface br-tun
                type: internal
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
    ovs_version: "3.0.0-0056-25.04-based-3.3.5"
```
However, hardware offloading does not happen: tcpdump on eth14 still shows every packet, and no TC rules are created.
OpenStack uses complex Open vSwitch rules to set up virtual networking. In this scenario, VLAN tagging is done by the following rule:
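As a cross-check on the tcpdump observation, the datapath flows themselves can be inspected. The sketch below assumes this OVS build prints an `offloaded:yes` or `offloaded:no` field in the verbose output of `ovs-appctl dpctl/dump-flows -m`; the helper name `summarize_dp_flows` is hypothetical.

```shell
# Sketch only: counts datapath flows by offload state.
# Reads `ovs-appctl dpctl/dump-flows -m` output on stdin.
summarize_dp_flows() {
    awk '
        /offloaded:yes/ { hw++ }   # flow was pushed down via TC
        /offloaded:no/  { sw++ }   # flow stayed in the kernel datapath
        END { printf "offloaded=%d software=%d\n", hw, sw }
    '
}

# Usage (as root on the compute node):
#   ovs-appctl dpctl/dump-flows -m | summarize_dp_flows
```

If everything lands in the software count, that would match what tcpdump shows on the representor.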
```
cookie=0x45b91976d5c4f175, duration=233.443s, table=60, n_packets=314, n_bytes=35574, priority=100,in_port=eth14 actions=set_field:0x11->reg5,set_field:0x1->reg6,resubmit(,73)
```
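The rule above only sets registers and resubmits to table 73, so the actions that finally reach the datapath are decided in later tables. A possible way to see them is `ovs-appctl ofproto/trace`, which prints a `Datapath actions:` line at the end of its output. The sketch below just extracts that line; the helper name `datapath_actions` is hypothetical, and the MAC addresses in the usage comment are an assumption (taken from the TC output in this post, with the VM side assumed to be the source).

```shell
# Sketch only: print the final datapath actions from an ofproto/trace run.
# Reads `ovs-appctl ofproto/trace ...` output on stdin.
datapath_actions() {
    sed -n 's/^Datapath actions: //p'
}

# Usage (as root on the compute node; MACs are assumed, adjust to the real ones):
#   ovs-appctl ofproto/trace br-int \
#       in_port=eth14,dl_src=fa:16:3e:36:1b:90,dl_dst=fa:16:3e:39:90:d8 \
#       | datapath_actions
```

Comparing this line between the OpenStack-managed setup and the simplified one might show which action is not being offloaded.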
If we simplify the OpenFlow rules and align them with the Nvidia guidelines for Open vSwitch hardware offloading, by plugging the representor device directly into the physnet bridge and assigning the VLAN tag as a port attribute, hardware offloading starts working:
```
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl del-port eth14
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl add-port br-tenant eth14
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl set port eth14 tag=1443
```
```
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# for i in $(ip link | awk '{print $2}' |grep -e enp -e ens |grep -v vf$); do tc filter show dev $i ingress; done
filter protocol 802.1Q pref 6 flower chain 0
filter protocol 802.1Q pref 6 flower chain 0 handle 0x1
  vlan_id 1443
  vlan_prio 0
  vlan_ethtype arp
  dst_mac fa:16:3e:36:1b:90
  src_mac fa:16:3e:39:90:d8
  eth_type arp
  in_hw in_hw_count 1
        action order 1: vlan pop pipe
        index 1 ref 1 bind 1
        no_percpu
        used_hw_stats delayed
        action order 2: mirred (Egress Redirect to device eth14) stolen
        index 1 ref 1 bind 1
        cookie 4b5b5a701f4e3fee7ac1c198037522e3
        no_percpu
        used_hw_stats delayed
```
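To make the comparison between the two setups less manual, the TC output can be reduced to a count of rules that are in hardware versus rules that are not. This is a minimal sketch that relies on the `in_hw` and `not_in_hw` markers `tc` prints for flower filters; the helper name `count_tc_offload` is hypothetical.

```shell
# Sketch only: count flower rules by hardware placement.
# Reads `tc filter show dev <dev> ingress` output on stdin.
count_tc_offload() {
    awk '
        $1 == "in_hw"     { hw++ }   # rule is installed in the NIC
        $1 == "not_in_hw" { sw++ }   # rule exists only in software
        END { printf "in_hw=%d not_in_hw=%d\n", hw, sw }
    '
}

# Usage (as root on the compute node):
#   tc filter show dev enp33s0f0np0 ingress | count_tc_offload
```

For the output above this gives one rule in hardware and none in software; in the OpenStack-managed setup both counts stay at zero because no rules are created at all.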
A similar issue occurs with a VXLAN network.
OpenStack sets up the VXLAN tunnel as follows, with the tunnel ID inserted dynamically via OpenFlow rules (in_key=flow):
```
        Port vxlan-0a0c0004
            Interface vxlan-0a0c0004
                type: vxlan
                options: {df_default="true", dst_port="4790", egress_pkt_mark="0", in_key=flow, local_ip="10.12.0.9", out_key=flow, remote_ip="10.12.0.4"}
```
In this configuration hardware offloading does not happen either. But when VXLAN is set up according to the Nvidia guidelines (no complex rules involved), hardware offloading starts working: tcpdump on the representor device shows only the first packet, and TC rules are created.
```
        Port vxlan3
            Interface vxlan3
                type: vxlan
                options: {key="60987", local_ip="10.12.0.9", remote_ip="10.12.0.4"}
```
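For reproducibility, a static-key port like the one above could be created with a single `ovs-vsctl` call. The sketch below only prints the command rather than running it. The helper name `vxlan_port_cmd` is hypothetical, and the bridge in the usage comment (`br-int`) is an assumption, since the listing above does not show which bridge vxlan3 was attached to.

```shell
# Sketch only: print (do not run) the ovs-vsctl command for a VXLAN port
# with a fixed tunnel key, as opposed to in_key=flow/out_key=flow.
# $1 = bridge, $2 = port name, $3 = local IP, $4 = remote IP, $5 = VNI
vxlan_port_cmd() {
    local bridge="$1" port="$2" local_ip="$3" remote_ip="$4" key="$5"
    printf 'ovs-vsctl add-port %s %s -- set interface %s type=vxlan options:local_ip=%s options:remote_ip=%s options:key=%s\n' \
        "$bridge" "$port" "$port" "$local_ip" "$remote_ip" "$key"
}

# Usage (bridge name is assumed):
#   vxlan_port_cmd br-int vxlan3 10.12.0.9 10.12.0.4 60987
```

Printing the command first makes it easy to review before changing anything on a node that Neutron also manages.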
```
Cannot find device "enp33s0f0npf0vf"
Cannot find device "veth68d427d@if1"
filter protocol ip pref 2 flower chain 0
filter protocol ip pref 2 flower chain 0 handle 0x1
  dst_mac fa:16:3e:65:0c:6d
  src_mac 52:5a:d0:9e:8a:32
  eth_type ipv4
  enc_dst_ip 10.12.0.9
  enc_src_ip 10.12.0.8
  enc_key_id 60987
  enc_dst_port 4789
  enc_tos 0
  ip_flags nofrag
  in_hw in_hw_count 2
        action order 1: tunnel_key unset pipe
        index 2 ref 1 bind 1
        no_percpu
        used_hw_stats delayed
        action order 2: mirred (Egress Redirect to device eth15) stolen
        index 2 ref 1 bind 1
        cookie 5d44cdf6754fc237cc0bd688a6fe70a8
        no_percpu
        used_hw_stats delayed
```
Am I doing something wrong? Has anyone seen different results?