Hi Arnaud,

Are you using agent_mode=dvr_snat on computes? That is unsupported:

https://review.opendev.org/c/openstack/neutron/+/801503

-Brian

On 12/12/22 4:30 AM, Arnaud Morin wrote:
Hello,
My subnet is: 192.168.43.0/24
My router is: 192.168.43.1
My ironic server is: 192.168.43.43
When I ping the router from the server:

$ ping -c5 192.168.43.1
PING 192.168.43.1 (192.168.43.1) 56(84) bytes of data.
64 bytes from 192.168.43.1: icmp_seq=1 ttl=64 time=0.458 ms
64 bytes from 192.168.43.1: icmp_seq=1 ttl=64 time=0.899 ms (DUP!)
64 bytes from 192.168.43.1: icmp_seq=2 ttl=64 time=0.372 ms
64 bytes from 192.168.43.1: icmp_seq=2 ttl=64 time=0.399 ms (DUP!)
64 bytes from 192.168.43.1: icmp_seq=3 ttl=64 time=0.484 ms
64 bytes from 192.168.43.1: icmp_seq=3 ttl=64 time=0.485 ms (DUP!)
64 bytes from 192.168.43.1: icmp_seq=4 ttl=64 time=0.411 ms
64 bytes from 192.168.43.1: icmp_seq=4 ttl=64 time=0.411 ms (DUP!)
64 bytes from 192.168.43.1: icmp_seq=5 ttl=64 time=0.299 ms

--- 192.168.43.1 ping statistics ---
5 packets transmitted, 5 received, +4 duplicates, 0% packet loss, time 4101ms
rtt min/avg/max/mdev = 0.299/0.468/0.899/0.161 ms
We can see the DUP! replies, which come from the 2 SNAT nodes that I have (I am using max_l3_agents_per_router=2).
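For anyone who wants to check this on their own setup, here is a rough Python sketch that counts replies per icmp_seq in captured ping output. The function name replies_per_seq is just something made up for illustration, and the regex assumes the iputils-style output format shown above; other ping implementations may print differently.

```python
import re
from collections import Counter

# Assumes iputils-style reply lines containing "icmp_seq=<n>".
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def replies_per_seq(ping_output):
    """Return a Counter mapping each icmp_seq to the number of replies seen."""
    return Counter(int(seq) for seq in SEQ_RE.findall(ping_output))


# Sample taken from the ping output in this message.
sample = """\
64 bytes from 192.168.43.1: icmp_seq=1 ttl=64 time=0.458 ms
64 bytes from 192.168.43.1: icmp_seq=1 ttl=64 time=0.899 ms (DUP!)
64 bytes from 192.168.43.1: icmp_seq=2 ttl=64 time=0.372 ms
64 bytes from 192.168.43.1: icmp_seq=2 ttl=64 time=0.399 ms (DUP!)
64 bytes from 192.168.43.1: icmp_seq=3 ttl=64 time=0.484 ms
64 bytes from 192.168.43.1: icmp_seq=3 ttl=64 time=0.485 ms (DUP!)
64 bytes from 192.168.43.1: icmp_seq=4 ttl=64 time=0.411 ms
64 bytes from 192.168.43.1: icmp_seq=4 ttl=64 time=0.411 ms (DUP!)
64 bytes from 192.168.43.1: icmp_seq=5 ttl=64 time=0.299 ms
"""

counts = replies_per_seq(sample)
# Every reply beyond the first for a given sequence number is a duplicate.
duplicates = sum(n - 1 for n in counts.values())
print(f"{duplicates} duplicate replies across {len(counts)} requests")
```

On the sample this agrees with the "+4 duplicates" that ping itself reports.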
Cheers
On 12.12.22 - 10:11, Rodolfo Alonso Hernandez wrote:
Hello Arnaud:
You said "all distributed routers are answering to ARP and ICMP, thus creating duplicates in the network". To what IP addresses are the DVR routers replying?
Regards.
On Mon, Dec 12, 2022 at 10:01 AM Arnaud Morin <arnaud.morin@gmail.com> wrote:
Hello team,
When using a router in DVR (+ HA) mode, we end up with the router on every compute where it is needed.
So far, this is nice.
We want to introduce Ironic baremetal servers with private network access. DVR won't apply to such baremetal servers, and we know floating IPs are not going to work.
Anyway, we were thinking that the SNAT part would be OK. After doing a few tests, we noticed that all distributed routers are answering to ARP and ICMP, thus creating duplicates in the network.
$ arping -c1 192.168.43.1
ARPING 192.168.43.1
60 bytes from fa:16:3f:67:97:6a (192.168.43.1): index=0 time=634.700 usec
60 bytes from fa:16:3f:dc:67:91 (192.168.43.1): index=1 time=750.298 usec

--- 192.168.43.1 statistics ---
1 packets transmitted, 2 packets received, 0% unanswered (1 extra)
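To make the symptom easier to spot in a script, a small Python sketch like the following could flag any IP that is answered by more than one MAC. The helper name macs_per_ip is hypothetical, and the regex assumes reply lines in the format printed above; other arping variants format their output differently.

```python
import re
from collections import defaultdict

# Assumes reply lines of the form:
#   60 bytes from <mac> (<ip>): index=0 time=... usec
REPLY_RE = re.compile(
    r"bytes from (?P<mac>(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}) "
    r"\((?P<ip>[0-9.]+)\)"
)


def macs_per_ip(arping_output):
    """Map each replying IP to the set of MAC addresses that answered for it."""
    found = defaultdict(set)
    for match in REPLY_RE.finditer(arping_output):
        found[match.group("ip")].add(match.group("mac").lower())
    return dict(found)


# Sample taken from the arping output in this message.
sample = """\
ARPING 192.168.43.1
60 bytes from fa:16:3f:67:97:6a (192.168.43.1): index=0 time=634.700 usec
60 bytes from fa:16:3f:dc:67:91 (192.168.43.1): index=1 time=750.298 usec
"""

for ip, macs in macs_per_ip(sample).items():
    if len(macs) > 1:
        print(f"{ip} is answered by {len(macs)} MACs: {sorted(macs)}")
```

With the sample above, 192.168.43.1 is reported with both MAC addresses, one per responding router instance.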
Is there anything we can do on the Neutron side to prevent this?
FYI, I did a comparison with routers in centralized mode (+ HA). In that situation, keepalived puts the qr-xxx interface down in the qrouter namespace. In distributed mode, keepalived runs in the snat- namespace and cannot manage the router interface.
Any help / tip would be appreciated.
Thanks!
Arnaud.