[neutron] OVS Hardware Offloading Performance Issues
Hello everyone,

I've been testing OVS hardware-offloaded ports for a while. I have ConnectX-5 and ConnectX-6 cards in my lab. Iperf tests look promising, but when I use hardware-offloaded ports for krbd traffic, after some time the traffic falls back to the software path. Another interesting thing is that, without my changing anything, it can also switch back to the hardware path. This issue mostly occurs when I run the test with 3 virtual machines on the same hypervisor.

I'm not sure whether this is a configuration issue or I'm hitting the cards' limits. If any of you have experience with this, all ideas are more than welcome.

OpenStack version: Caracal
Operating System: Ubuntu 22.04

Thank you,
Serhat
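For reference, the ports are set up along the usual switchdev flow, roughly like this (a sketch; the PCI address, interface name, and network name below are placeholders, not the exact values from my lab):

  # Put the PF eswitch into switchdev mode and enable TC offload
  devlink dev eswitch set pci/0000:3b:00.0 mode switchdev
  ethtool -K ens37f0np0 hw-tc-offload on

  # Enable hardware offload in OVS
  ovs-vsctl set Open_vSwitch . other_config:hw-offload=true

  # Create the Neutron port with the switchdev capability
  openstack port create --network private --vnic-type direct \
    --binding-profile '{"capabilities": ["switchdev"]}' offload-port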
Hello,

I think it may be related to OVS flooding. Offloading works when there is no traffic flooding; I have observed flooded traffic when things slowed down.

My latest test scenario has 2 VMs within the same VLAN on one hypervisor, with explicitly_egress_direct = True set (see the snippet below).

Thank you,
Serhat
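For completeness, that option goes in the [agent] section of the OVS agent config (the file path below is the usual one, but it may differ per deployment):

  # /etc/neutron/plugins/ml2/openvswitch_agent.ini
  [agent]
  explicitly_egress_direct = True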
After a little more debugging, I can see the tc flower rule change:

From Offloaded:

$ tc filter show dev ens37f0npf0vf6 ingress
filter protocol ip pref 2 flower chain 0
filter protocol ip pref 2 flower chain 0 handle 0x1
  dst_mac 00:1c:73:aa:bb:cc
  src_mac fa:16:7e:2e:8c:f0
  eth_type ipv4
  ip_flags nofrag
  skip_sw
  in_hw in_hw_count 1
        action order 1: vlan push id 111 protocol 802.1Q priority 0 pipe
         index 4 ref 1 bind 1
        no_percpu
        used_hw_stats delayed

        action order 2: mirred (Egress Redirect to device openstack) stolen
         index 4 ref 1 bind 1
         cookie 605b82e5b94ef3fa63c3e78349851a27
        no_percpu
        used_hw_stats delayed

To software path:

ubuntu@noted-lynx:~$ tc filter show dev ens37f0npf0vf6 ingress
filter protocol ip pref 2 flower chain 0
filter protocol ip pref 2 flower chain 0 handle 0x1
  dst_mac 00:1c:73:aa:bb:cc
  src_mac fa:16:7e:2e:8c:f0
  eth_type ipv4
  ip_flags nofrag
  not_in_hw
        action order 1: vlan push id 111 protocol 802.1Q priority 0 pipe
         index 4 ref 1 bind 1
        no_percpu

        action order 2: skbedit ptype host pipe
         index 11 ref 1 bind 1

        action order 3: mirred (Ingress Mirror to device br-ex) pipe
         index 13 ref 1 bind 1
         cookie 5037716e444e83b81c4ef9844df2377d
        no_percpu

        action order 4: mirred (Egress Redirect to device openstack) stolen
         index 15 ref 1 bind 1
         cookie 5037716e444e83b81c4ef9844df2377d
        no_percpu

The flow is also no longer among the offloaded datapath flows:

# ovs-appctl dpctl/dump-flows type=offloaded | grep 00:1c:73:aa:bb:cc
# ovs-appctl dpctl/dump-flows | grep 00:1c:73:aa:bb:cc
recirc_id(0),in_port(4),eth(src=fa:16:7e:2e:8c:f0,dst=00:1c:73:aa:bb:cc),eth_type(0x0800),ipv4(frag=no), packets:7819823, bytes:27047973132, used:2.000s, actions:push_vlan(vid=111,pcp=0),1,2
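To correlate the fallback with flooding, these are the kinds of checks I'm running (a sketch; the device and bridge names are the ones from my lab):

  # Watch whether the flower rule stays in hardware
  watch -n1 'tc -s filter show dev ens37f0npf0vf6 ingress | grep -c in_hw'

  # Check whether the destination MAC is still in the bridge FDB;
  # once the entry ages out, OVS floods the traffic, and a flooded
  # flow is what seems to push the rule back to the software path
  ovs-appctl fdb/show br-int | grep 00:1c:73:aa:bb:cc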