Hi everyone!
I’m facing a weird situation on a tenant of one of our OpenStack clusters, which is based on Victoria.
On this tenant, the network topology is as follows:
One DMZ network (192.168.0.0/24) linked to our public network through a Neutron router, where there is a VM acting as a bastion/router for the MGMT network.
On the DMZ network, there is a Debian 11 Linux VM, let’s call it VM-A, with a Floating IP from the public pool. This VM is attached to both the DMZ network (ens3 / 192.168.0.12) and the MGMT network (ens4 / 172.16.31.23).
All other VMs, let’s call them VM-X, are exclusively attached to the MGMT network (ens4).
I’ve set up VM-A with kernel IP forwarding (ip_forward) enabled and the following iptables rule:
# iptables -t nat -A POSTROUTING -o ens3 -j SNAT --to-source 192.168.0.12
My VM-X instances are each set up with a default gateway via VM-A:
# ip route add default via 172.16.31.23
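For reference, the setup described above can be sketched as a small script. This is only a minimal sketch: it assumes the gateway is VM-A's MGMT address (172.16.31.23) as given earlier, the `run` helper and the `DRY_RUN` variable are hypothetical names introduced here, and by default it only prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints each command unless DRY_RUN=0 is set,
# in which case the command is executed (root required).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On VM-A: enable IPv4 forwarding (a sysctl, not a kernel module) and
# SNAT everything leaving through the DMZ interface.
setup_vm_a() {
    run sysctl -w net.ipv4.ip_forward=1
    run iptables -t nat -A POSTROUTING -o ens3 -j SNAT --to-source 192.168.0.12
}

# On each VM-X: send default traffic to VM-A's MGMT address.
setup_vm_x() {
    run ip route add default via 172.16.31.23
}

setup_vm_a
setup_vm_x
```

Running it with `DRY_RUN=0` on the matching VM would apply the commands for real; neither setting is persistent across reboots.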
The setup seems to be working: without the iptables rule and kernel forwarding, I can’t see any packets from VM-X on my DMZ interface (ens3) on VM-A.
OK, now that you have the whole picture, let’s dive into the issue.
When the iptables rule, kernel forwarding and gateway are all set, I can see my VM-X traffic (ICMP ping to a DNS server) going from VM-X (ens4) to VM-A (ens4), then being forwarded to VM-A (ens3) and finally going out to the targeted public service.
What’s not working, however, is the response: it never gets back to VM-X.
I’ve run tcpdump on the traffic between VM-X and VM-A at each point of the platform:
from inside the VM-X NIC, on the tap device, on the qbr bridge, on the qvb veth, on the qvo side of the veth, through the OVS bridges, and vice versa.
However, the response packets don’t get any further than the VM-A qvo veth.
Once the traffic exits VM-A, it never reaches VM-X.
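The capture walk described above can be sketched as follows. This is a rough sketch only: the device names are placeholders (the real tap/qbr/qvb/qvo names depend on the Neutron port ID), and `capture_cmds` is a hypothetical helper that just prints one tcpdump command per capture point to run on the relevant hypervisor.

```shell
#!/bin/sh
# Print one tcpdump command per capture point.
# -nn: no name/port resolution, -e: show MAC addresses, -i: interface.
capture_cmds() {
    peer_ip="$1"
    shift
    for dev in "$@"; do
        echo "tcpdump -nn -e -i $dev icmp and host $peer_ip"
    done
}

# Placeholder device names: replace them with the real devices of the port.
capture_cmds 172.16.31.54 tapEXAMPLE qbrEXAMPLE qvbEXAMPLE qvoEXAMPLE
```

Filtering on the VM-X address (172.16.31.54) keeps only the forwarded flow on the MGMT side, and `-e` makes it easy to see which MAC address each reply carries.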
What’s really suspicious here is that a direct ping from VM-X (172.16.31.54) to VM-A (172.16.31.23) comes back correctly, so it looks as if OVS decided that the response in the SNAT case isn’t legitimate, or something similar.
Has anyone been able to get such a setup working?
Here is some additional information:
The hosts run CentOS 8.5 with the latest updates.
Our platform is an OpenStack Victoria deployed using kolla-ansible.
We are using an OVS-based deployment.
Our tunnels are VXLAN.
All VMs have a fully open secgroup applied, and so do all ports (I checked it twice, including the iptables rules on the hosts).
If you need any additional information, feel free to let me know!
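The port-level settings can be listed with the OpenStack CLI, as in the sketch below. It assumes python-openstackclient is installed and credentials are loaded; `VM_A_MGMT_PORT_ID` is a placeholder and `port_check_cmd` is a hypothetical helper that only prints the command. Neutron's port security applies address checks in addition to the secgroup rules, so these fields may be worth a look, although this is only a guess at where to look and not a confirmed cause.

```shell
#!/bin/sh
# Print the CLI command that shows the security-related fields of a port.
port_check_cmd() {
    echo "openstack port show $1 -c port_security_enabled -c allowed_address_pairs -c security_group_ids"
}

# Placeholder port ID: replace it with the ID of VM-A's MGMT port.
port_check_cmd VM_A_MGMT_PORT_ID
```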