[xena][neutron][ovn] Follow up to BGP with OVN PTG discussions
Thank you to all who attended the discussions at the Xena PTG regarding BGP dynamic routing with Neutron using OVN. Here's a brief summary of the important points covered, and some background information.

Red Hat has begun gathering a team of engineers to add OpenStack support for BGP dynamic routing using the Free Range Routing (FRR) set of daemons. Acting as a technical lead for the project, I led one session in the TripleO room to discuss the installer components and two sessions in the Neutron room to discuss BGP routing with OVN, and BGP EVPN with OVN.

There was some feedback during the Neutron sessions that little has been done to engage the greater OpenStack/Neutron community thus far, or to utilize the existing RFE process for Neutron. This feedback was correct and has been taken on board. The initial work was done with a design that didn't require changes to Neutron or core OVN to start with. Now that we have more clarity as to the direction of further efforts, the intention is to create Neutron RFEs and to work with others who are working along similar lines. There will likely be opportunities to leverage APIs and contribute to existing work being done with Neutron Dynamic Routing, BGPVPN, and other work being done to implement BGP EVPN. We would like to collaborate with Ericsson and others and come up with a solution that fits us all!

The first steps involved changes primarily to the deployment tooling, and these were proposed and reviewed upstream in the TripleO project:

https://specs.openstack.org/openstack/tripleo-specs/specs/wallaby/triplo-bgp...

There are several use cases for using BGP, and in fact there are separate efforts underway to utilize BGP for the control plane and the data plane.

BGP may be used for equal-cost multipath (ECMP) load balancing of outbound links, and bi-directional forwarding detection (BFD) for resiliency to ensure that a path provides connectivity. For outbound connectivity BGP will learn routes from BGP peers. There is separate work being done to add BFD support to Neutron and Neutron Dynamic Routing, but this does not provide BFD support for host communication at the hypervisor layer. Using FRR at the host level provides this BFD support.

BGP may be used for advertising routes to API endpoints. In this model HAProxy will listen on an IP address and FRR will advertise routes to that IP to BGP peers. High availability for HAProxy is provided via other means such as Pacemaker, and FRR will simply advertise the virtual IP address when it is active on an API controller.

BGP may also be used for routing inbound traffic to provider network IPs or floating IPs for instance connectivity. The Compute nodes will run FRR to advertise routes to the local VM IPs or floating IPs hosted on the node. FRR has a daemon named Zebra that is responsible for exchanging routes between routing daemons such as BGP and the kernel. The "redistribute connected" statement in the FRR configuration will cause local IP addresses on the host to be advertised via BGP. Floating IP addresses are attached to a loopback interface in a namespace, so they will be redistributed using this method. Changes to OVN will be required to ensure that provider network IPs assigned to VMs are also assigned to a loopback interface in a namespace in a similar fashion.
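To make that mechanism a bit more concrete, here is a minimal Python sketch of the idea described above. This is not the actual agent code; the namespace name "bgp-nic" and the helper names are purely illustrative. The point is simply that adding or removing a /32 on a loopback device inside a dedicated namespace is what causes FRR's "redistribute connected" to advertise or withdraw the route.

import subprocess

def expose_ip(ip_address, namespace="bgp-nic"):
    # Adding the address to the loopback device inside the namespace makes it
    # a locally connected address there, so FRR ("redistribute connected")
    # advertises the corresponding /32 route to its BGP peers.
    subprocess.run(
        ["ip", "netns", "exec", namespace,
         "ip", "address", "add", ip_address + "/32", "dev", "lo"],
        check=True)

def withdraw_ip(ip_address, namespace="bgp-nic"):
    # Removing the address causes the /32 route to be withdrawn again.
    subprocess.run(
        ["ip", "netns", "exec", namespace,
         "ip", "address", "del", ip_address + "/32", "dev", "lo"],
        check=True)

In practice this would be triggered by events such as a VM being started on the Compute node rather than called by hand, and traffic to these addresses is never terminated in the namespace; it is forwarded to the VM by OVS flows on the hypervisor, as described below.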
FRR was selected for integration into TripleO and OVN for several reasons: using FRR leverages a proven, production-grade routing solution that gives us BGP, BFD (bi-directional forwarding detection), VRFs in separate namespaces for multitenancy, integration with kernel routing, and potentially other features such as OSPF, RPKI, route monitoring/mirroring, and more. FRR has a very complete feature set and is very robust, although there are other BGP speakers available, such as ExaBGP, BIRD, and os-ken, with varying feature sets.

OVN will need to be modified to enable the Compute node to assign VM provider network IPs to a loopback interface inside a namespace. These IP addresses will not be used for sending or receiving traffic, only for redistributing routes to those IPs to BGP peers. Traffic which is sent to those IP addresses will be forwarded to the VM using OVS flows on the hypervisor. An example agent for OVN has been written to demonstrate how to monitor the southbound OVN DB and create loopback IP addresses when a VM is started on a Compute node. The OVN changes will be detailed in a separate OVN spec. Demonstration code is available on GitHub:

https://github.com/luis5tb/bgp-agent

BGP EVPN with multitenancy will require separate VRFs per tenant. This will allow separate routing tables to be maintained, and allow for overlapping IP addresses for different Neutron tenant networks. FRR may have the capability to utilize a single BGP peering session to combine advertisements for all these VRFs, but there is still work to be done to prototype this design. This may result in more efficient BGP dynamic updates, and could potentially make troubleshooting more straightforward.

As suggested in the PTG discussions, we are investigating the BGPVPN API. It appears that this API will work well for this use case.

Hopefully we can make significant progress during the Xena development cycle, and we will be able to define what needs to be done in subsequent cycles. Any thoughts, suggestions, and contributions are appreciated.

If anyone would like to review the work that we've already published, there is a series of blog posts that Luis Tomas Bolivar made related to how to use it on OpenStack and how it works:

- OVN-BGP agent introduction: https://ltomasbo.wordpress.com/2021/02/04/openstack-networking-with-bgp/
- How to set it up in a DevStack environment: https://ltomasbo.wordpress.com/2021/02/04/ovn-bgp-agent-testing-setup/
- In-depth traffic flow inspection: https://ltomasbo.wordpress.com/2021/02/04/ovn-bgp-agent-in-depth-traffic-flo...

Here are some relevant links to posts written by Luis Tomas Bolivar on the ovs-discuss mailing list:

https://mail.openvswitch.org/pipermail/ovs-discuss/2021-March/051029.html
https://mail.openvswitch.org/pipermail/ovs-discuss/2021-March/051031.html
https://mail.openvswitch.org/pipermail/ovs-discuss/2021-March/051033.html

--
Dan Sneddon | Senior Principal Software Engineer
dsneddon@redhat.com | redhat.com/cloud
dsneddon:irc | @dxs:twitter
Hi Dan,
> Red Hat has begun gathering a team of engineers to add OpenStack support for BGP dynamic routing using the Free Range Routing (FRR) set of daemons. Acting as a technical lead for the project, I led one session in the TripleO room to discuss the installer components and two sessions in the Neutron room to discuss BGP routing with OVN, and BGP EVPN with OVN.
There may be quite a lot of overlap with what we (Ericsson) are working on right now; we would be really interested in your long-term vision and also in the details of your plans.
> There will likely be opportunities to leverage APIs and contribute to existing work being done with Neutron Dynamic Routing, BGPVPN, and other work being done to implement BGP EVPN. We would like to collaborate with Ericsson and others and come up with a solution that fits us all!
There are a few related specs proposed already. Below are two links that may be the most relevant to you. All your input is welcome.

BFD: https://review.opendev.org/c/openstack/neutron-specs/+/767337
BGP: https://review.opendev.org/c/openstack/neutron-specs/+/783791
> BGP may be used for equal-cost multipath (ECMP) load balancing of outbound links, and bi-directional forwarding detection (BFD) for resiliency to ensure that a path provides connectivity.
> BGP may also be used for routing inbound traffic to provider network IPs or floating IPs for instance connectivity.
I believe we also share these two use cases - with some caveats, please see below.
> The Compute nodes will run FRR to advertise routes to the local VM IPs or floating IPs hosted on the node. FRR has a daemon named Zebra that is responsible for exchanging routes between routing daemons such as BGP and the kernel. The redistribute connected statement in the FRR configuration will cause local IP addresses on the host to be advertised via BGP. Floating IP addresses are attached to a loopback interface in a namespace, so they will be redistributed using this method. Changes to OVN will be required to ensure provider network IPs assigned to VMs will be assigned to a loopback interface in a namespace in a similar fashion.
Am I getting it right that your primary objective is to route the traffic directly to the hypervisors and there hoist it to the tunnel networks? Some of the links in your email also gave me the impression that occasionally you'd want to route the traffic to a neutron router's gateway port. Is that right? In which cases?

Currently neutron-dynamic-routing advertises routes with their nexthop being the router's gw port. We have a use case for arbitrary VM ports being the nexthop. And you seem to have a use case for the hypervisor being the nexthop. Maybe we could come up with an extension of the n-d-r API that can express these variations... Similar thoughts could be applied to the BFD proposal too.
In the further development of the proof-of-concept, how much do you plan to make this API-driven? The PoC seems to be reacting to port binding events, but most other information (peers, filters, maybe nexthops) seems to be coming from TripleO-deployed configuration and not from the API. How would you like this to look in the long term?
> BGP EVPN with multitenancy will require separate VRFs per tenant. This will allow separate routing tables to be maintained, and allow for overlapping IP addresses for different Neutron tenant networks. FRR may have the capability to utilize a single BGP peering session to combine advertisements for all these VRFs, but there is still work to be done to prototype this design. This may result in more efficient BGP dynamic updates, and could potentially make troubleshooting more straightforward.
BGP to API endpoints and BGPVPN-related things are not on our plate right now. However, support in Neutron for VRFs could be interesting to us too.

Thanks for the great writeup!

Cheers,
Bence Romsics
irc: rubasov
Ericsson Software Technology