SNAT failure with OVN under Antelope

Yatin Karel ykarel at redhat.com
Tue Jun 27 14:37:45 UTC 2023


Hi Gary,

On top of what Rodolfo said:
On Tue, Jun 27, 2023 at 5:15 PM Gary Molenkamp <molenkam at uwo.ca> wrote:

> Good morning,   I'm having a problem with snat routing under OVN but I'm
> not sure if something is mis-configured or just my understanding of how
> OVN is architected is wrong.
>
> I've built a Zed cloud, since upgraded to Antelope, using the Neutron
> Manual install method here:
> https://docs.openstack.org/neutron/latest/install/ovn/manual_install.html
> I'm using a multi-tenant configuration using geneve and the flat
> provider network is present on each hypervisor. Each hypervisor is
> connected to the physical provider network, along with the tenant
> network and is tagged as an external chassis under OVN.
>          br-int exists, as does br-provider
>          ovs-vsctl set open . external-ids:ovn-cms-options=enable-chassis-as-gw
>

Is there a specific reason to enable the gateway on compute nodes? It's
generally recommended to use controller/network nodes as gateways. What does
your environment look like (number of controller, network, and compute nodes)?


> For most cases, distributed FIP based connectivity is working without
> issue, but I'm having an issue where VMs without a FIP are not always
> able to use the SNAT services of the tenant network router.
> Scenario:
>      Internal network named cs3319:  with subnet 172.31.100.0/23
>      Has a router named cs3319_router with external gateway set (snat
> enabled)
>
>      This network has 3 vms:
>          - #1 has a FIP and can be accessed externally
>          - #2 has no FIP, can be accessed via VM1 and can access
> external resources via SNAT  (ie OS repos, DNS, etc)
>          - #3 has no FIP, can be accessed via VM1 but has no external
> SNAT connectivity
>

Considering it works for some VMs but not for others, the point above about
enable-chassis-as-gw could be related.
Is the working VM hosted on compute05 or on some other compute node? Where is
the gateway router port scheduled (you can check ovn-sbctl show for
cr-lrp-<router gateway port id>)?
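
To make that check concrete, here is a minimal sketch that prints the hostname
of the chassis a given cr-lrp port is bound to, by filtering ovn-sbctl show
output. The helper name find_gw_chassis is hypothetical, and it assumes the
output layout seen in this thread (a hostname: line followed by Port_Binding
lines under each Chassis). The port ID in the usage comment comes from the
lrp-44ae1753-... gateway port in your ovn-nbctl output.

```shell
# find_gw_chassis PORT
# Reads `ovn-sbctl show` output on stdin and prints the hostname of every
# chassis that has a Port_Binding named PORT (hypothetical helper).
find_gw_chassis() {
    awk -v port="$1" '
        $1 == "Chassis"      { host = "" }        # new chassis block starts
        $1 == "hostname:"    { host = $2 }        # remember its hostname
        $1 == "Port_Binding" {
            gsub(/"/, "", $2)                     # names may or may not be quoted
            if ($2 == port) print host
        }
    '
}

# Usage on a node that can reach the southbound DB:
#   ovn-sbctl show | find_gw_chassis cr-lrp-44ae1753-845e-4822-9e3d-a41e0469e257
```

If this prints nothing, the port is not bound to any chassis; if it prints a
compute node, that node is the one doing SNAT for the router.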


>  From what I can tell,  the chassis config is correct, compute05 is the
> hypervisor and the faulty VM has a port binding on this hypervisor:
>
> ovn-sbctl show
> ...
> Chassis "8e0fa17c-e480-4b60-9015-bd8833412561"
>      hostname: compute05.cloud.sci.uwo.ca
>      Encap geneve
>          ip: "192.168.0.105"
>          options: {csum="true"}
>      Port_Binding "7a5257eb-caea-45bf-b48c-620c5dff4b39"
>      Port_Binding "50e16602-78e6-429b-8c2f-e7e838ece1b4"
>      Port_Binding "f121c9f4-c3fe-4ea9-b754-a809be95a3fd"
>
> The router has the candidate gateways, and the snat set:
>
> ovn-nbctl show  92df19a7-4ebe-43ea-b233-f4e9f5a46e7c
> router 92df19a7-4ebe-43ea-b233-f4e9f5a46e7c
> (neutron-389439b5-07f8-44b6-a35b-c76651b48be5) (aka cs3319_public_router)
>      port lrp-44ae1753-845e-4822-9e3d-a41e0469e257
>          mac: "fa:16:3e:9a:db:d8"
>          networks: ["129.100.21.94/22"]
>          gateway chassis: [5c039d38-70b2-4ee6-9df1-596f82c68106
> 99facd23-ad17-4b68-a8c2-1ff6da15ac5f
> 1694116c-6d30-4c31-b5ea-0f411878316e
> 2a4bbaf9-228a-462e-8970-0cdbf59086e6 9332c61b-93e1-4a70-9547-701a014bfd98]
>      port lrp-509bba37-fa06-42d6-9210-2342045490db
>          mac: "fa:16:3e:ff:0f:3b"
>          networks: ["172.31.100.1/23"]
>      nat 11e0565a-4695-4f67-b4ee-101f1b1b9a4f
>          external ip: "129.100.21.94"
>          logical ip: "172.31.100.0/23"
>          type: "snat"
>      nat 21e4be02-d81c-46e8-8fa8-3f94edb4aed1
>          external ip: "129.100.21.87"
>          logical ip: "172.31.100.49"
>          type: "dnat_and_snat"
>
> Each network agent on the hypervisors shows the ovn controller up :
>       OVN Controller Gateway agent | compute05.cloud.sci.uwo.ca
> |                   | :-)   | UP    | ovn-controller
>
> The ovs vswitch on the hypervisor looks correct afaict and ovn ports bfd
> status are all forwarding to other hypervisors. ie:
>     Port ovn-2a4bba-0
>              Interface ovn-2a4bba-0
>                  type: geneve
>                  options: {csum="true", key=flow,
> remote_ip="192.168.0.106"}
>                  bfd_status: {diagnostic="No Diagnostic",
> flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic",
> remote_state=up, state=up}
>
>
> Any advice on where to look would be appreciated.
>

I have seen MTU-specific issues in the past, so it would be good to rule out any
MTU issue by comparing the working and non-working cases.
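
As a rough sketch of such a check: ping an external host from each VM with
fragmentation forbidden and a payload sized to exactly fill the MTU. For IPv4
the payload is the MTU minus 28 bytes (20 for the IP header, 8 for the ICMP
header). The helper name below is hypothetical, the MTU values are assumed
examples (use whatever your networks actually report), and -M do is the Linux
iputils ping option.

```shell
# Largest ICMP echo payload that fits in one IPv4 packet of the given MTU:
# MTU minus 20 bytes (IPv4 header without options) minus 8 bytes (ICMP header).
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Example values only: 1442 is an assumed tenant network MTU,
# 1500 an assumed provider network MTU.
icmp_payload_for_mtu 1442   # prints 1414
icmp_payload_for_mtu 1500   # prints 1472

# From inside each VM, forbid fragmentation and send that payload to some
# external host (placeholder below):
#   ping -M do -c 3 -s "$(icmp_payload_for_mtu 1442)" <external host>
```

If the full-size ping fails from VM #3 but succeeds from VM #2, MTU along the
SNAT path is worth a closer look.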

> PS.  Version info:
>      Neutron 22.0.0-1
>      OVN 22.12
>
>     neutron options:
>        enable_distributed_floating_ip = true
>        ovn_l3_scheduler = leastloaded
>
>
>
> Thanks
> Gary
>
>
>
> --
> Gary Molenkamp                  Science Technology Services
> Systems/Cloud Administrator     University of Western Ontario
> molenkam at uwo.ca                  http://sts.sci.uwo.ca
> (519) 661-2111 x86882           (519) 661-3566
>
>

Thanks and Regards
Yatin Karel