SNAT failure with OVN under Antelope

Gary Molenkamp molenkam at uwo.ca
Wed Jul 12 15:43:12 UTC 2023


A little progress, but I may be tripping over bug 
https://bugs.launchpad.net/neutron/+bug/2003455

If I remove the provider bridge from the second hypervisor:
     ovs-vsctl remove open . external-ids ovn-cms-options="enable-chassis-as-gw"
     ovs-vsctl remove open . external-ids ovn-bridge-mappings
     ip link set br-provider down
     ovs-vsctl del-br br-provider
and disable in the [ovn] section of plugin.ini:
     enable_distributed_floating_ip

Then both VMs using SNAT on each compute server work.
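After stripping the gateway role from the second hypervisor, it may be worth confirming that the southbound database agrees before retesting. A rough check (a sketch; exactly where ovn-cms-options lands, other_config vs. external_ids of the Chassis record, varies by OVN version):

```shell
# List the registered chassis and see which still advertise
# enable-chassis-as-gw in their options.
ovn-sbctl list Chassis | grep -E 'name|cms-options'
```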

Turning the second chassis back on as a gateway immediately breaks the 
VM on the second compute server:

     ovs-vsctl set open . external-ids:ovn-cms-options=enable-chassis-as-gw
     ovs-vsctl add-br br-provider
     ovs-vsctl set open . external-ids:ovn-bridge-mappings=provider:br-provider
     ovs-vsctl add-port br-provider ens256
     systemctl restart ovn-controller openvswitch.service
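To see where the router's gateway port is actually bound after re-enabling the second chassis, something like the following should show the claiming chassis. This is a sketch: the lrp-<uuid>/cr-lrp-<uuid> port names and $SB are placeholders for your deployment, not values from this thread.

```shell
# Which chassis has claimed the chassis-redirect port for the router?
ovn-sbctl --db=$SB find Port_Binding logical_port=cr-lrp-<uuid>

# From the NB side, list the gateway-chassis scheduling for the LRP,
# including priorities.
ovn-nbctl lrp-get-gateway-chassis lrp-<uuid>
```

If the cr-lrp stays bound to the first chassis while the second VM's traffic blackholes, that matches the symptom described in the linked bug.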

I am running neutron 22.0.1, but perhaps this is something related?

python3-neutron-22.0.1-1.el9s.noarch
openstack-neutron-common-22.0.1-1.el9s.noarch
openstack-neutron-22.0.1-1.el9s.noarch
openstack-neutron-ml2-22.0.1-1.el9s.noarch
openstack-neutron-openvswitch-22.0.1-1.el9s.noarch
openstack-neutron-ovn-metadata-agent-22.0.1-1.el9s.noarch

On 2023-07-12 10:21, Gary Molenkamp wrote:
> For comparison, I looked at how openstack-ansible was setting up OVN 
> and I don't see any major differences, other than that O-A configures a 
> manager for ovs:
>       ovs-vsctl --id @manager create Manager "target=\ ....
> I don't believe this is the point of failure (but feel free to correct 
> me if I'm wrong ;) ).
>
> ovn-trace on both VMs' inports shows the same trace for the working VM 
> and the non-working VM, i.e.:
>
> ovn-trace --db=$SB --ovs default_net 'inport == 
> "f4cbc8c7-e7bf-47f3-9fea-a1663f6eb34d" && eth.src==fa:16:3e:a6:62:8e 
> && ip4.src == 172.31.101.168 && ip4.dst == <provider's gateway IP>'
>
>
>
> On 2023-07-07 14:08, Gary Molenkamp wrote:
>> Happy Friday afternoon.
>>
>> I'm still pondering a lack of connectivity in an HA OVN with each 
>> compute node acting as a potential gateway chassis.
>>
>>>>     The problem is basically that the port of the OVN LRP may not
>>>>     be in the same chassis as the VM that failed (since the CR-LRP
>>>>     will be where the first VM of that network will be
>>>>     created). The suggestion is to remove the enable-chassis-as-gw
>>>>     from the compute nodes to allow the VM to forward traffic via
>>>>     tunneling/Geneve to the chassis where the LRP resides.
>>>>
>>>
>>>     I forced a similar VM onto the same chassis as the working VM,
>>>     and it was able to communicate out.    If we do want to keep
>>>     multiple chassis as gateways, would that be addressed with the
>>>     ovn-bridge-mappings?
>>>
>>>
>>
>> I built a small test cloud to explore this further as I continue to 
>> see the same issue:  A vm will only be able to use SNAT outbound if 
>> it is on the same chassis as the CR-LRP.
>>
>> In my test cloud, I have one controller, and two compute nodes. The 
>> controller only runs the north and southd in addition to the neutron 
>> server.  Each of the two compute nodes is configured as below.  On a 
>> tenant network I have three VMs:
>>     - #1:  cirros VM with FIP
>>     - #2:  cirros VM running on compute node 1
>>     - #3:  cirros VM running on compute node 2
>>
>> E/W traffic between VMs in the same tenant network is fine. N/S 
>> traffic is fine for the FIP.  N/S traffic only works for the VM whose 
>> CR-LRP is active on the same chassis.   Does anything jump out as a 
>> mistake in my understanding as to how this should be working?
>>
>> Thanks as always,
>> Gary
>>
>>
>> on each hypervisor:
>>
>> /usr/bin/ovs-vsctl set open . external-ids:ovn-remote=tcp:{{ 
>> controllerip }}:6642
>> /usr/bin/ovs-vsctl set open . external-ids:ovn-encap-type=geneve
>> /usr/bin/ovs-vsctl set open . external-ids:ovn-encap-ip={{ 
>> overlaynetip }}
>> /usr/bin/ovs-vsctl set open . 
>> external-ids:ovn-cms-options=enable-chassis-as-gw
>> /usr/bin/ovs-vsctl add-br br-provider -- set bridge br-provider 
>> protocols=OpenFlow10,OpenFlow12,OpenFlow13,OpenFlow14,OpenFlow15
>> /usr/bin/ovs-vsctl add-port br-provider {{ provider_nic }}
>> /usr/bin/ovs-vsctl br-set-external-id provider bridge-id br-provider
>> /usr/bin/ovs-vsctl set open . 
>> external-ids:ovn-bridge-mappings=provider:br-provider
>>
>> plugin.ini:
>> [ml2]
>> mechanism_drivers = ovn
>> type_drivers = flat,geneve
>> tenant_network_types = geneve
>> extension_drivers = port_security
>> overlay_ip_version = 4
>> [ml2_type_flat]
>> flat_networks = provider
>> [ml2_type_geneve]
>> vni_ranges = 1:65536
>> max_header_size = 38
>> [securitygroup]
>> enable_security_group = True
>> firewall_driver = 
>> neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver
>> [ovn]
>> ovn_nb_connection = tcp:{{controllerip}}:6641
>> ovn_sb_connection = tcp:{{controllerip}}:6642
>> ovn_l3_scheduler = leastloaded
>> ovn_metadata_enabled = True
>> enable_distributed_floating_ip = true
>>

-- 
Gary Molenkamp			Science Technology Services
Systems Engineer		University of Western Ontario
molenkam at uwo.ca                  http://sts.sci.uwo.ca
(519) 661-2111 x86882		(519) 661-3566