[ovn-bgp-agent][neutron] - expose_tenant_networks bug

Satish Patel satish.txt at gmail.com
Wed Aug 31 19:09:20 UTC 2022


Hi Luis,

Here is the information you asked for.

### Versions

pyroute2        =       0.7.2
openvswitch-switch  =  2.17.0-0ubuntu1~cloud0
ovn    = 22.03.0-0ubuntu1~cloud0
devstack master branch

### Rack-1-host-2

vagrant at rack-1-host-2:~$ ip rule
0: from all lookup local
1000: from all lookup [l3mdev-table]
32000: from all to 10.0.0.1/26 lookup br-ex
32000: from all to 172.16.1.144 lookup br-ex
32000: from all to 172.16.1.148 lookup br-ex
32766: from all lookup main
32767: from all lookup default


vagrant at rack-1-host-2:~$ ip route show table br-ex
default dev br-ex scope link
10.0.0.0/26 via 172.16.1.144 dev br-ex
172.16.1.144 dev br-ex scope link
172.16.1.148 dev br-ex scope link

### Rack-2-host-1

vagrant at rack-2-host-1:~$ ip rule
0: from all lookup local
1000: from all lookup [l3mdev-table]
32000: from all to 172.16.1.143 lookup br-ex
32766: from all lookup main
32767: from all lookup default


vagrant at rack-2-host-1:~$ ip route show table br-ex
default dev br-ex scope link
172.16.1.143 dev br-ex scope link


#### I quickly cloned the latest master branch of ovn-bgp-agent, ran it, and
found the following error. I am assuming your patch is part of that master
branch.

rack-1-host-2: https://paste.opendev.org/show/bWbhmbzbi8YHGZsbhUAb/


Notes: Is this a bug or something else? -
https://opendev.org/x/ovn-bgp-agent/src/branch/master/ovn_bgp_agent/privileged/vtysh.py#L27

I had to replace the code at line 27 of vtysh.py (linked above) with the
following to fix the vtysh error.

@ovn_bgp_agent.privileged.vtysh_cmd.entrypoint
def run_vtysh_config(frr_config_file):
    vtysh_command = "copy {} running-config".format(frr_config_file)
    full_args = ['/usr/bin/vtysh', '--vty_socket',
                 constants.FRR_SOCKET_PATH, '-c']
    full_args.extend(vtysh_command.split(' '))
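
For context, here is a minimal sketch of how the whole function looks with
that change applied. The imports and the processutils.execute call are my
assumption of what the rest of the upstream module does, not a verbatim copy
of the file:

from oslo_concurrency import processutils

import ovn_bgp_agent.privileged
from ovn_bgp_agent import constants


@ovn_bgp_agent.privileged.vtysh_cmd.entrypoint
def run_vtysh_config(frr_config_file):
    # Pass each word of the vtysh command as a separate argument instead
    # of a single quoted string.
    vtysh_command = "copy {} running-config".format(frr_config_file)
    full_args = ['/usr/bin/vtysh', '--vty_socket',
                 constants.FRR_SOCKET_PATH, '-c']
    full_args.extend(vtysh_command.split(' '))
    # Assumed: run the command; error handling omitted for brevity.
    return processutils.execute(*full_args)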

On Wed, Aug 31, 2022 at 3:51 AM Luis Tomas Bolivar <ltomasbo at redhat.com>
wrote:

>
>
> On Wed, Aug 31, 2022 at 9:12 AM Luis Tomas Bolivar <ltomasbo at redhat.com>
> wrote:
>
>> See below
>>
>>
>> On Tue, Aug 30, 2022 at 10:14 PM Satish Patel <satish.txt at gmail.com>
>> wrote:
>>
>>> Hi Luis,
>>>
>>> I have redeployed my lab and I have the following components:
>>>
>>> rack-1-host-1 - controller
>>> rack-1-host-2 - compute1
>>> rack-2-host-1 - compute2
>>>
>>>
>>> # I am running ovn-bgp-agent on only two compute nodes compute1 and
>>> compute2
>>> [DEFAULT]
>>> debug=False
>>> expose_tenant_networks=True
>>> driver=ovn_bgp_driver
>>> reconcile_interval=120
>>> ovsdb_connection=unix:/var/run/openvswitch/db.sock
>>>
>>> ### Without any VM at present, I can see only the router gateway IP on
>>> rack-1-host-2
>>>
>>
>> Yep, this is what is expected at this point.
>>
>>
>>>
>>> vagrant at rack-1-host-2:~$ ip a show ovn
>>> 37: ovn: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master
>>> ovn-bgp-vrf state UNKNOWN group default qlen 1000
>>>     link/ether 0a:f7:6e:e0:19:69 brd ff:ff:ff:ff:ff:ff
>>>     inet 172.16.1.144/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet6 fe80::8f7:6eff:fee0:1969/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>>
>>> vagrant at rack-2-host-1:~$ ip a show ovn
>>> 15: ovn: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master
>>> ovn-bgp-vrf state UNKNOWN group default qlen 1000
>>>     link/ether 56:61:6b:29:ac:29 brd ff:ff:ff:ff:ff:ff
>>>     inet6 fe80::5461:6bff:fe29:ac29/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>>
>>> ### Let's create vm1, which ends up on rack-1-host-2, but it didn't expose
>>> the vm1 IP (tenant IP); same with rack-2-host-1
>>>
>>> vagrant at rack-1-host-2:~$ ip a show ovn
>>> 37: ovn: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master
>>> ovn-bgp-vrf state UNKNOWN group default qlen 1000
>>>     link/ether 0a:f7:6e:e0:19:69 brd ff:ff:ff:ff:ff:ff
>>>     inet 172.16.1.144/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet6 fe80::8f7:6eff:fee0:1969/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>
>> It should be exposed here, what about the output of "ip rule" and "ip
>> route show table br-ex"?
>>
>>
>>>
>>> vagrant at rack-2-host-1:~$ ip a show ovn
>>> 15: ovn: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master
>>> ovn-bgp-vrf state UNKNOWN group default qlen 1000
>>>     link/ether 56:61:6b:29:ac:29 brd ff:ff:ff:ff:ff:ff
>>>     inet6 fe80::5461:6bff:fe29:ac29/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>>
>>> ### Let's attach a floating IP to vm1 and see. Now I can see 10.0.0.17, the
>>> vm1 IP, got exposed on rack-1-host-2; at the same time nothing on
>>> rack-2-host-1 (of course, because no VM is running on it)
>>>
>>> vagrant at rack-1-host-2:~$ ip a show ovn
>>> 37: ovn: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master
>>> ovn-bgp-vrf state UNKNOWN group default qlen 1000
>>>     link/ether 0a:f7:6e:e0:19:69 brd ff:ff:ff:ff:ff:ff
>>>     inet 172.16.1.144/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet 10.0.0.17/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet 172.16.1.148/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet6 fe80::8f7:6eff:fee0:1969/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>
>> There is also a resync action happening every 120 seconds... Perhaps for
>> some reason the initial addition of 10.0.0.17 failed and then the sync
>> discovered it and added it (and it matched with the time you added the FIP
>> more or less).
>>
>> But events are managed one by one and those 2 are different, so adding
>> the FIP is not adding the internal IP. It was probably a sync action.
>>
>>
>>>
>>> vagrant at rack-2-host-1:~$ ip a show ovn
>>> 15: ovn: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master
>>> ovn-bgp-vrf state UNKNOWN group default qlen 1000
>>>     link/ether 56:61:6b:29:ac:29 brd ff:ff:ff:ff:ff:ff
>>>     inet6 fe80::5461:6bff:fe29:ac29/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>>
>>> #### Let's spin up vm2, which should end up on the other compute node,
>>> rack-2-host-1 (no change yet; the vm2 IP wasn't exposed anywhere yet)
>>>
>>> vagrant at rack-1-host-2:~$ ip a show ovn
>>> 37: ovn: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master
>>> ovn-bgp-vrf state UNKNOWN group default qlen 1000
>>>     link/ether 0a:f7:6e:e0:19:69 brd ff:ff:ff:ff:ff:ff
>>>     inet 172.16.1.144/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet 10.0.0.17/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet 172.16.1.148/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet6 fe80::8f7:6eff:fee0:1969/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>>
>>> vagrant at rack-2-host-1:~$ ip a show ovn
>>> 15: ovn: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master
>>> ovn-bgp-vrf state UNKNOWN group default qlen 1000
>>>     link/ether 56:61:6b:29:ac:29 brd ff:ff:ff:ff:ff:ff
>>>     inet6 fe80::5461:6bff:fe29:ac29/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>>
>>> #### Let's again attach a floating IP to vm2 (so far nothing has changed;
>>> technically it should expose the IP on rack-1-host-2)
>>>
>>> vagrant at rack-1-host-2:~$ ip a show ovn
>>> 37: ovn: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master
>>> ovn-bgp-vrf state UNKNOWN group default qlen 1000
>>>     link/ether 0a:f7:6e:e0:19:69 brd ff:ff:ff:ff:ff:ff
>>>     inet 172.16.1.144/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet 10.0.0.17/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet 172.16.1.148/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet6 fe80::8f7:6eff:fee0:1969/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>> The IP of the second VM should be exposed here ^, in rack-1-host-2,
>>> while the FIP should show up in the other compute (rack-2-host-1)
>>>
>>
>>
>>> vagrant at rack-2-host-1:~$ ip a show ovn
>>> 15: ovn: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master
>>> ovn-bgp-vrf state UNKNOWN group default qlen 1000
>>>     link/ether 56:61:6b:29:ac:29 brd ff:ff:ff:ff:ff:ff
>>>     inet 172.16.1.143/32 scope global ovn
>>>        valid_lft forever preferred_lft forever
>>>     inet6 fe80::5461:6bff:fe29:ac29/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>>
>>> Here is the logs - https://paste.opendev.org/show/bRThivJE4wvEN92DXJUo/
>>>
>>
>> Which node do these logs belong to? rack-1-host-2?
>>
>> And are you running with the latest code? It looks like the problem is in the
>> sync function when trying to ensure the routing table entry for br-ex. It
>> prints this:
>>
>> 2022-08-30 20:12:54.541 8318 DEBUG ovn_bgp_agent.utils.linux_net [-] Found routing table for br-ex with: ['200', 'br-ex']
>>
>> So definitely ovn_routing_tables should be initialized with {'br-ex':
>> 200}, so I don't really get where the KeyError comes from...
>>
>> Unless it is not accessing the dict, but accessing the ndb.routes...
>> perhaps with the pyroute2 version you have, the family parameter is needed
>> there. Let me send a patch that you can try with
>>
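
[To illustrate the kind of family-qualified ndb.routes lookup I understand is
being described, here is a minimal, purely hypothetical sketch -- the table
number and destination are made up and this is not the actual agent code:]

from socket import AF_INET

from pyroute2 import NDB

with NDB() as ndb:
    # With some pyroute2 versions the record spec may need the address
    # family given explicitly; if no route matches the spec, the lookup
    # raises KeyError, which is the kind of error seen in the agent logs.
    route = ndb.routes[{'table': 200,            # e.g. the br-ex table
                        'dst': '10.0.0.0/26',
                        'family': AF_INET}]
    print(route['gateway'])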
>
> This is the patch https://review.opendev.org/c/x/ovn-bgp-agent/+/855062.
> Give it a try and let me know if the error you are seeing in the logs goes
> away with it
>
>
>>
>>> On Thu, Aug 25, 2022 at 6:25 AM Luis Tomas Bolivar <ltomasbo at redhat.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Aug 25, 2022 at 11:31 AM Satish Patel <satish.txt at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Luis,
>>>>>
>>>>> Very interesting. You are saying it will only expose the tenant IP on the
>>>>> gateway port node, even though we have a DVR setup in the cluster, correct?
>>>>>
>>>>
>>>> Almost. The path is the same as in a DVR setup without BGP (with the
>>>> difference that you can reach the internal IP). In a DVR setup, when the VM is
>>>> in a tenant network, without a FIP, the traffic goes out through the cr-lrp
>>>> (ovn router gateway port), i.e., through the node hosting that port, which
>>>> connects the router of the subnet where the VM is to the provider
>>>> network.
>>>>
>>>> Note this is a limitation due to how OVN is used in OpenStack Neutron,
>>>> where traffic needs to be injected into the OVN overlay in the node holding the
>>>> cr-lrp. We are investigating possible ways to overcome this limitation and
>>>> expose the IP right away in the node hosting the VM.
>>>>
>>>>
>>>>> Is the gateway node going to expose IPs for all other compute nodes?
>>>>>
>>>>
>>>>> What if I have multiple gateway nodes?
>>>>>
>>>>
>>>> No, each router connected to the provider network will have its own ovn
>>>> router gateway port, and that can be allocated in any node which has
>>>> "enable-chassis-as-gw". What is true is that all VMs in tenant networks
>>>> connected to the same router will be exposed in the same location.
>>>>
>>>>
>>>>> Did you configure that flag on all nodes or just the gateway node?
>>>>>
>>>>
>>>> I usually deploy with 3 controllers which are also my "networker"
>>>> nodes, so those are the ones having the enable-chassis-as-gw flag.
>>>>
>>>>
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> On Aug 25, 2022, at 4:14 AM, Luis Tomas Bolivar <ltomasbo at redhat.com>
>>>>> wrote:
>>>>>
>>>>> 
>>>>> I tested it locally and it is exposing the IP properly in the node
>>>>> where the ovn router gateway port is allocated. Could you double check if
>>>>> that is the case in your setup too?
>>>>>
>>>>> On Wed, Aug 24, 2022 at 8:58 AM Luis Tomas Bolivar <
>>>>> ltomasbo at redhat.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 23, 2022 at 6:04 PM Satish Patel <satish.txt at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Folks,
>>>>>>>
>>>>>>> I am setting up an ovn-bgp-agent lab in "BGP mode" and I found
>>>>>>> everything working great except exposing tenant networks:
>>>>>>> https://ltomasbo.wordpress.com/2021/02/04/ovn-bgp-agent-testing-setup/
>>>>>>>
>>>>>>>
>>>>>>> Lab Summary:
>>>>>>>
>>>>>>> 1 controller node
>>>>>>> 3 compute node
>>>>>>>
>>>>>>> ovn-bgp-agent is running on all compute nodes because I am using
>>>>>>> "enable_distributed_floating_ip=True"
>>>>>>>
>>>>>>
>>>>>>> ovn-bgp-agent config:
>>>>>>>
>>>>>>> [DEFAULT]
>>>>>>> debug=False
>>>>>>> expose_tenant_networks=True
>>>>>>> driver=ovn_bgp_driver
>>>>>>> reconcile_interval=120
>>>>>>> ovsdb_connection=unix:/var/run/openvswitch/db.sock
>>>>>>>
>>>>>>> I am not seeing my VM's tenant IP getting exposed, but when I attach a
>>>>>>> FIP it gets exposed on the loopback address. Here is the full trace of debug
>>>>>>> logs: https://paste.opendev.org/show/buHiJ90nFgC1JkQxZwVk/
>>>>>>>
>>>>>>
>>>>>> It is not exposed in any node, right? Note that when expose_tenant_networks
>>>>>> is enabled, the traffic to the tenant VM is exposed in the node holding the
>>>>>> cr-lrp (ovn router gateway port) for the router connecting the tenant
>>>>>> network to the provider one.
>>>>>>
>>>>>> The FIP will be exposed in the node where the VM is.
>>>>>>
>>>>>> On the other hand, the error you see there should not happen, so I'll
>>>>>> investigate why that is and also double check if the expose_tenant_networks
>>>>>> flag is broken somehow.
>>>>>>
>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>>> --
>>>>>> LUIS TOMÁS BOLÍVAR
>>>>>> Principal Software Engineer
>>>>>> Red Hat
>>>>>> Madrid, Spain
>>>>>> ltomasbo at redhat.com
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> LUIS TOMÁS BOLÍVAR
>>>>> Principal Software Engineer
>>>>> Red Hat
>>>>> Madrid, Spain
>>>>> ltomasbo at redhat.com
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> LUIS TOMÁS BOLÍVAR
>>>> Principal Software Engineer
>>>> Red Hat
>>>> Madrid, Spain
>>>> ltomasbo at redhat.com
>>>>
>>>>
>>>
>>
>> --
>> LUIS TOMÁS BOLÍVAR
>> Principal Software Engineer
>> Red Hat
>> Madrid, Spain
>> ltomasbo at redhat.com
>>
>>
>
>
> --
> LUIS TOMÁS BOLÍVAR
> Principal Software Engineer
> Red Hat
> Madrid, Spain
> ltomasbo at redhat.com
>
>