[OpenStack Ansible] [Deployment] Networking design
Hello,

I am trying to build a minimal but production-grade cloud with OpenStack-Ansible, following the online deployment guide. The idea is that it should be able to scale up later without radical changes.

Online reference: https://docs.openstack.org/project-deploy-guide/openstack-ansible/2025.1/
openstack-ansible version: stable/2025.1

Target hosts (Ubuntu 24.04 LTS):
1) Infra1: Controller 1 + networking
2) Infra2: Controller 2
3) Compute 1: Compute 1 + storage 1
4) Compute 2: Compute 2 + storage 2

Deploy host: dev-host (Ubuntu 24.04 LTS)

Network: each server has two network adapters, say enp11s0f0 and enp11s0f1 (port names may differ between servers; en0 and en1 for short).

Public IP (en1):
- Infra1: 10.2.46.70/24, fixed IP permanently assigned by the provider
- Infra2, Compute 1, Compute 2: 10.2.46.XX/24, dynamic IPs temporarily assigned for software installation

Private IPs (en0):
- Infra1: 172.29.236.11/22 (VLAN 10), 172.29.240.11/22 (VLAN 30)
- Infra2: 172.29.236.12/22 (VLAN 10), 172.29.240.12/22 (VLAN 30)
- Compute 1: 172.29.236.13/22 (VLAN 10), 172.29.240.13/22 (VLAN 30), 172.29.244.13/22 (VLAN 20)
- Compute 2: 172.29.236.14/22 (VLAN 10), 172.29.240.14/22 (VLAN 30), 172.29.244.14/22 (VLAN 20)

VLAN 10 - management network
VLAN 30 - tunnel network
VLAN 20 - storage network

Target: since only Infra1:en1 has a fixed public IP, it would be the only Internet connection, without any port bonding.

What has been done so far:

1) All the plain IP and VLAN settings listed above - verified and working (checked with ping and curl).

2) user_variables.yml:
   haproxy_keepalived_external_vip_cidr: "10.2.46.70/24"
   haproxy_keepalived_internal_vip_cidr: "172.29.236.11/22"
   haproxy_keepalived_external_interface: enp11s0f1
   haproxy_keepalived_internal_interface: br-mgmt

3) openstack_user_config.yml:
   internal_lb_vip_address: 172.29.236.11
   external_lb_vip_address: 10.2.46.70

Based on previous experience as a network engineer and a lot of reading (books, online guides, and especially the OpenStack-Ansible deployment reference), I understand there should or could be something extra to be done. Can anybody give me a clue about what else is needed, or whether this is already enough to make it work?

Sorry for the noise, but networking seems to be the most complicated part of an OpenStack deployment. The online examples are hard to comprehend and customize. One small example: Ubuntu is promoting netplan, while the examples are all written for the /etc/network/interfaces file, which is difficult to adapt to Ubuntu 24.04.

Thanks a lot in advance for the help!
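For concreteness, a netplan sketch of the Infra1 layout described above could look roughly like this (the file name, the vlan10/vlan30 and bridge names, the upstream gateway and the resolver are assumptions for illustration, not values from the deployment guide):

# /etc/netplan/50-osa.yaml on Infra1 (sketch only; adjust names and addresses)
network:
  version: 2
  renderer: networkd
  ethernets:
    enp11s0f0:                # en0: trunk port carrying VLAN 10 and VLAN 30
      dhcp4: false
    enp11s0f1:                # en1: public port with the fixed provider IP
      addresses: [10.2.46.70/24]
      routes:
        - to: default
          via: 10.2.46.1      # assumed provider gateway
      nameservers:
        addresses: [1.1.1.1]  # example resolver
  vlans:
    vlan10:
      id: 10
      link: enp11s0f0
    vlan30:
      id: 30
      link: enp11s0f0
  bridges:
    br-mgmt:                  # management bridge referenced by user_variables.yml
      interfaces: [vlan10]
      addresses: [172.29.236.11/22]
    br-vxlan:                 # tunnel bridge (assumed name)
      interfaces: [vlan30]
      addresses: [172.29.240.11/22]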
Hey,

I think we should actually have some netplan examples as well: https://opendev.org/openstack/openstack-ansible/src/branch/master/etc/netpla...

Moreover, you can try to leverage our systemd-networkd role (networkd is used as a backend by netplan anyway, so they don't conflict); some reference can be found here: https://docs.openstack.org/openstack-ansible/latest/user/network-arch/exampl...

One thing I've spotted right away: you should generally use /32 for the haproxy_keepalived VIP CIDRs, as the VIPs are going to be added as aliases on the "main" interfaces. Another one: you'd better use an FQDN for internal_lb_vip_address/external_lb_vip_address and then set haproxy_bind_internal_lb_vip_address/haproxy_bind_external_lb_vip_address to the actual IP addresses.

The main question I have regarding networking is the single host having access to the "uplink". In that case, I assume, only this one host should be acting as a HAProxy host (as the others won't have a working external VIP on them). Also, by default we do not install keepalived when there is a single host in the haproxy group, as it makes limited sense; you can override that by setting `haproxy_use_keepalived: true` in user_variables.

But then - what are your expectations for VMs accessing the Internet? Is it planned for them to reach the world through Infra1 via geneve (tunnel) networks? In general it is expected that there is some kind of public subnet from which IPs can be allocated, even though the traffic will be SRC/DST NAT-ed through the "net" node. In other words, you can have that traffic flow, but then for the "public" network in Neutron you'd need some tagged VLAN with a public subnet, to be used by the L3 routers in the SDN. You can do some nasty hacks, like we do in the AIO, where the public network is "fake" and another SRC NAT happens through the node's default route, but that is not how you should build a production setup.
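Applied to the variables quoted in the original message, those suggestions might look roughly like this (only the variable names and IP addresses come from the thread; the FQDNs are placeholders to replace with real DNS names):

# user_variables.yml (sketch)
haproxy_keepalived_external_vip_cidr: "10.2.46.70/32"     # /32: added as an alias on the interface
haproxy_keepalived_internal_vip_cidr: "172.29.236.11/32"
haproxy_keepalived_external_interface: enp11s0f1
haproxy_keepalived_internal_interface: br-mgmt
haproxy_use_keepalived: true                              # needed if only one host ends up in the haproxy group
haproxy_bind_external_lb_vip_address: 10.2.46.70
haproxy_bind_internal_lb_vip_address: 172.29.236.11

# openstack_user_config.yml, global_overrides (sketch) - FQDNs instead of raw IPs
# internal_lb_vip_address: internal.cloud.example.com     # placeholder FQDN
# external_lb_vip_address: cloud.example.com              # placeholder FQDN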
Dear Dmitriy,

Thanks a lot for the informative instructions! I still need to spend some time to take it all in...

In the meantime, would it be possible for you to take a look at the link below, which shows the rough idea of the networking? (https://github.com/holywi/git-tutorial/blob/master/mmexport1759995429620.png)

1. As Infra1:enp11s0f1 is the only port with Internet access, we will take that single port as the exit of the system, regardless of whether such a setup is conventional for production or only for testing.

2. The network setup of Infra1, somewhere around the Networking L2/L3 agents, is the most confusing part for me. Can't we just use something like plain NAT? I honestly have no idea about that part of the configuration; my preference would be to keep it simple and straightforward.

3. I am not sure whether it is reasonable to attach br-vxlan to enp11s0f0, or whether it must be attached to enp11s0f1 as in the reference link you provided?

Thanks again!
Best Regards
Hey,

1. "As Infra1:enp11s0f1 is the only available port to Internet" -> I am even more confused now. On your diagram I can see that the computes are also connected to the same public network as Infra1. So is Infra1 the only host with access to it, or not?

2. "Can't we just use something like plain NAT" -> Floating IPs are implemented through SRC-DST (one-to-one) NAT. There is also an SRC NAT implemented on the Logical Routers (or L3 agents, depending on the driver). So the task of a Logical Router is exactly to perform this NAT (and some routing): it acts as the gateway for the internal (geneve/vxlan) networks on the tenant side, while also needing a connection and an IP address from the public network in order to do the NAT.

3. We use `br-vxlan` in the documentation more as a "common approach" thing. It does not have to be a separate bridge. For vxlan/geneve networks to work, it is enough to have an interface with an IP address from the "tunnel" network on it. So indeed, you can simplify things by dropping br-vxlan and just leaving enp11s0f0.30.

It feels like I also need to add some clarity on the "modern" setup with OVN, as I see you are using a lot of OVS/LXB terminology, which can be confusing. So I will quickly describe the differences below:

- Since 2023.1 (Antelope) we use the ml2.ovn driver for Neutron by default, instead of ml2.lxb (ml2.ovs is very similar to it as well).
- OVN uses the geneve protocol for encapsulating and implementing tenant networks, instead of vxlan.
- Unlike the ovs/lxb drivers, which require L3 agents and spawn virtual routers in network namespaces, OVN implements routing with OpenFlow inside OVS. So the L3 agent service is not used or deployed there.
- OVN finally has a very good implementation of DVR, so you can serve FIPs directly from the computes running the VMs, without needing dedicated network nodes. You can still have them (they are called gateway nodes now), but many smaller setups may benefit from not having them.

One more thing I wanted to highlight: on your scheme I do not see a network that will be used for access to the OpenStack APIs or the dashboard (Horizon/Skyline) from the "world". We usually suggest having a separate "public" network for this purpose, as br-mgmt is mainly designed for internal usage by the components; ideally it should not be exposed or used for "external" connections.
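As a rough illustration of point 3 and the OVN notes above, the provider_networks section of openstack_user_config.yml could be sketched as below. This is only a sketch based on the OSA example configurations: the group names (neutron_ovn_controller, neutron_ovn_gateway), the VLAN/geneve ranges, the br-provider name, and the bridge-less tunnel entry are assumptions to verify against the openstack_user_config reference for the release being deployed.

# openstack_user_config.yml - provider_networks sketch (illustration only)
global_overrides:
  provider_networks:
    - network:                            # management network used by OSA containers
        container_bridge: "br-mgmt"
        container_type: "veth"
        container_interface: "eth1"
        ip_from_q: "container"
        type: "raw"
        group_binds:
          - all_containers
          - hosts
        is_management_address: true
    - network:                            # geneve tunnel network - no br-vxlan,
        container_bridge: "enp11s0f0.30"  # just the tagged interface holding the tunnel IP (assumed valid per the note above)
        container_type: "veth"
        container_interface: "eth10"
        ip_from_q: "tunnel"
        type: "geneve"
        range: "1:1000"                   # assumed tunnel ID range
        net_name: "geneve"
        group_binds:
          - neutron_ovn_controller
    - network:                            # tagged VLAN carrying the public subnet for
        container_bridge: "br-provider"   # Neutron routers / floating IPs (assumed name)
        container_type: "veth"
        container_interface: "eth11"
        type: "vlan"
        range: "200:200"                  # assumed public VLAN ID
        net_name: "vlan"
        group_binds:
          - neutron_ovn_gateway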
Hi Dmitriy,

I sincerely appreciate the help! It clears up a lot of misunderstandings and confusion in my head.

In the meantime, to make life easier, I have applied for a public IP for each node, so now every host can have its own standalone Internet exit. I have updated the design drawing here: (https://github.com/holywi/git-tutorial/blob/master/architecture-1.png) I would be glad if you could take a look and comment.

I assume that putting an OVN gateway on each compute node is OK, right?

Thanks!
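For reference, that intent might be expressed in openstack_user_config.yml roughly as below (a sketch only; the network-gateway_hosts group name is taken from the OSA OVN scenario documentation and should be checked against the env.d/conf.d layout of the deployed release - the management IPs are the ones from this thread):

# openstack_user_config.yml - sketch of making the computes gateway-capable
compute_hosts:
  compute1:
    ip: 172.29.236.13
  compute2:
    ip: 172.29.236.14

network-gateway_hosts:        # hosts expected to run the OVN gateway chassis
  compute1:
    ip: 172.29.236.13
  compute2:
    ip: 172.29.236.14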