[neutron][ironic] Distributed routers and SNAT
Arnaud Morin
arnaud.morin at gmail.com
Mon Dec 12 15:41:00 UTC 2022
Here is a small image that explains the issue.
Both network nodes are hosting the same router, in HA.
One is MASTER, the other is BACKUP.
BUT, both of them are still accessible/answering.
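
As an illustration only, the symptom can be checked automatically by counting how many distinct MAC addresses answer ARP for one IP. This is a minimal sketch, not something from my deployment: the function names and the regex are assumptions, and the sample text is the arping output quoted further down in this thread.

```python
import re
from collections import defaultdict

# Matches arping reply lines such as:
#   60 bytes from fa:16:3f:67:97:6a (192.168.43.1): index=0 time=634.700 usec
REPLY_RE = re.compile(
    r"bytes from (?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) \((?P<ip>[0-9.]+)\)",
    re.IGNORECASE,
)


def macs_per_ip(arping_output):
    """Return {ip: sorted list of distinct MACs that answered for that ip}."""
    seen = defaultdict(set)
    for line in arping_output.splitlines():
        match = REPLY_RE.search(line)
        if match:
            seen[match.group("ip")].add(match.group("mac").lower())
    return {ip: sorted(macs) for ip, macs in seen.items()}


def duplicated_ips(arping_output):
    """Keep only the IPs that were answered by more than one MAC."""
    return {
        ip: macs
        for ip, macs in macs_per_ip(arping_output).items()
        if len(macs) > 1
    }


# Sample copied from the arping run quoted in this thread.
SAMPLE = """\
ARPING 192.168.43.1
60 bytes from fa:16:3f:67:97:6a (192.168.43.1): index=0 time=634.700 usec
60 bytes from fa:16:3f:dc:67:91 (192.168.43.1): index=1 time=750.298 usec
"""

if __name__ == "__main__":
    print(duplicated_ips(SAMPLE))
```

With a single healthy gateway, duplicated_ips() should return an empty dict; any entry means more than one router port is answering for the same address.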
Cheers
On 12.12.22 - 15:33, Arnaud Morin wrote:
> Yes, network nodes and baremetal nodes are on the same physical network.
> I want the baremetal to use the neutron routers as SNAT gateways,
> just like a regular instance.
>
> Within a hypervisor, the instance has a small qrouter- namespace
> (DVR) which acts as a local router before forwarding the traffic to
> the network (SNAT) node (this is done in OVS with OpenFlow rules).
>
> From a baremetal perspective, I don't have this small DVR, so I am
> reaching the router on the SNAT nodes, which are both answering.
>
>
>
> On 12.12.22 - 06:46, Julia Kreger wrote:
> > So for clarification, just so we're all on the same page: you
> > have dedicated network nodes, which are running the agent, and the bare
> > metal nodes are obviously wired into them on the same logical network?
> >
> > https://bugs.launchpad.net/neutron/+bug/1934666 refers only to compute
> > nodes, which seems different from this configuration.
> >
> > On Mon, Dec 12, 2022 at 6:36 AM Arnaud Morin <arnaud.morin at gmail.com> wrote:
> >
> > > Hello,
> > >
> > > I am not, on computes I am using agent_mode=dvr
> > > on network nodes, I am using agent_mode=dvr_snat
> > >
> > > > Note that the routers on the computes are also answering as soon as an
> > > > instance lives on them (or a DHCP agent hosting the network).
> > >
> > > Arnaud
> > >
> > > On 12.12.22 - 09:19, Brian Haley wrote:
> > > > Hi Arnaud,
> > > >
> > > > Are you using agent_mode=dvr_snat on computes? That is unsupported:
> > > >
> > > > https://review.opendev.org/c/openstack/neutron/+/801503
> > > >
> > > > -Brian
> > > >
> > > > On 12/12/22 4:30 AM, Arnaud Morin wrote:
> > > > > Hello,
> > > > >
> > > > > My subnet is: 192.168.43.0/24
> > > > >
> > > > > My router is: 192.168.43.1
> > > > >
> > > > > My ironic server is: 192.168.43.43
> > > > >
> > > > > > When I ping the router from the server:
> > > > > $ ping -c5 192.168.43.1
> > > > > PING 192.168.43.1 (192.168.43.1) 56(84) bytes of data.
> > > > > 64 bytes from 192.168.43.1: icmp_seq=1 ttl=64 time=0.458 ms
> > > > > 64 bytes from 192.168.43.1: icmp_seq=1 ttl=64 time=0.899 ms (DUP!)
> > > > > 64 bytes from 192.168.43.1: icmp_seq=2 ttl=64 time=0.372 ms
> > > > > 64 bytes from 192.168.43.1: icmp_seq=2 ttl=64 time=0.399 ms (DUP!)
> > > > > 64 bytes from 192.168.43.1: icmp_seq=3 ttl=64 time=0.484 ms
> > > > > 64 bytes from 192.168.43.1: icmp_seq=3 ttl=64 time=0.485 ms (DUP!)
> > > > > 64 bytes from 192.168.43.1: icmp_seq=4 ttl=64 time=0.411 ms
> > > > > 64 bytes from 192.168.43.1: icmp_seq=4 ttl=64 time=0.411 ms (DUP!)
> > > > > 64 bytes from 192.168.43.1: icmp_seq=5 ttl=64 time=0.299 ms
> > > > >
> > > > > --- 192.168.43.1 ping statistics ---
> > > > > > 5 packets transmitted, 5 received, +4 duplicates, 0% packet loss, time 4101ms
> > > > > rtt min/avg/max/mdev = 0.299/0.468/0.899/0.161 ms
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > We can see the DUP! replies, which come from the 2 SNAT nodes that I
> > > > > > have (I am using max_l3_agents_per_router=2).
> > > > >
> > > > >
> > > > >
> > > > > Cheers
> > > > >
> > > > >
> > > > > On 12.12.22 - 10:11, Rodolfo Alonso Hernandez wrote:
> > > > > > Hello Arnaud:
> > > > > >
> > > > > > You said "all distributed routers are answering to ARP and ICMP, thus
> > > > > > creating duplicates in the network". To what IP addresses are the DVR
> > > > > > routers replying?
> > > > > >
> > > > > > Regards.
> > > > > >
> > > > > >
> > > > > > > On Mon, Dec 12, 2022 at 10:01 AM Arnaud Morin <arnaud.morin at gmail.com> wrote:
> > > > > >
> > > > > > > Hello team,
> > > > > > >
> > > > > > > > When using a router in DVR (+ HA) mode, we end up having the router
> > > > > > > > on all computes where needed.
> > > > > > >
> > > > > > > So far, this is nice.
> > > > > > >
> > > > > > > > We want to introduce Ironic baremetal servers with private network
> > > > > > > > access.
> > > > > > > > DVR won't apply to such baremetal servers, and we know floating IPs
> > > > > > > > are not going to work.
> > > > > > >
> > > > > > > > Anyway, we were thinking that the SNAT part would be OK.
> > > > > > > > After doing a few tests, we noticed that all distributed routers are
> > > > > > > > answering to ARP and ICMP, thus creating duplicates in the network.
> > > > > > >
> > > > > > > $ arping -c1 192.168.43.1
> > > > > > > ARPING 192.168.43.1
> > > > > > > > 60 bytes from fa:16:3f:67:97:6a (192.168.43.1): index=0 time=634.700 usec
> > > > > > > > 60 bytes from fa:16:3f:dc:67:91 (192.168.43.1): index=1 time=750.298 usec
> > > > > > > >
> > > > > > > > --- 192.168.43.1 statistics ---
> > > > > > > > 1 packets transmitted, 2 packets received, 0% unanswered (1 extra)
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Is there anything we can do on the neutron side to prevent this?
> > > > > > >
> > > > > > >
> > > > > > > > FYI, I did a comparison with routers in centralized mode (+ HA).
> > > > > > > > In that situation, keepalived is putting the qr-xxx interface down
> > > > > > > > in the qrouter namespace.
> > > > > > > > In distributed mode, keepalived is running in the snat- namespace
> > > > > > > > and cannot manage the router interface.
> > > > > > >
> > > > > > > Any help / tip would be appreciated.
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > Arnaud.
> > > > > > >
> > > > > > >
> > > > >
> > >
> > >
-------------- next part --------------
A non-text attachment was scrubbed...
Name: s_1670859430.png
Type: image/png
Size: 123653 bytes
Desc: not available
URL: <https://lists.openstack.org/pipermail/openstack-discuss/attachments/20221212/ad6fe3e7/attachment-0001.png>