Hello,
I appreciate the prompt feedback. Unfortunately, after making multiple changes, we cannot make external networks to connect via gateway-hosts. Our follow up investigation has shown the following:
1. Removed flat and vlan provider_networks from /etc/openstack_deploy/openstack_user_config.yml. Only management provider_networks was defined here :
provider_networks: - network: container_bridge: "br-mgmt" container_type: "veth" container_interface: "ens4" ip_from_q: "management" type: "raw" group_binds: - all_containers - hosts is_management_address: true
2. Defined ML2 information and network types in /etc/openstack_deploy/user_variables.yml:
neutron_ml2_conf_ini_overrides: ml2: tenant_network_types: geneve ml2_type_flat: flat_networks: flat ml2_type_geneve: vni_ranges: 1:1000 max_header_size: 38
3. Moved neutron_provider_networks configuration on a per-host basis and removed network_mappings and network_interface_mappings for compute hosts in /etc/openstack_deploy/host_vars/
compute node /etc/openstack_deploy/host_vars/cmp3:
neutron_provider_networks: network_types: "geneve" network_geneve_ranges: "1:1000"
gateway node /etc/openstack_deploy/host_vars/net1:
neutron_provider_networks: network_types: "geneve" network_geneve_ranges: "1:1000" network_mappings: "flat:br-flat" network_interface_mappings: "br-flat:ens2"
4. Upon checking the new recreated inventory targets the correct neutron_ovn_gateway hosts /etc/openstack_deploy/openstack_inventory.json
… "component": "neutron_ovn_gateway", "container_name": "net1", "container_networks": { "management_address": { "address": "172.16.0.31", "bridge": "br-mgmt",
--
"component": "neutron_ovn_gateway", "container_name": "net2", "container_networks": { "management_address": { "address": "172.16.0.32", "bridge": "br-mgmt",
-- "neutron_ovn_gateway": { "children": [], "hosts": [ "net1", "net2" …
5. The correct ovn-cms-options=enable-chassis-as-gw is set on gateway nodes only:
ovn-sbctl list chassis | grep 'hostname|ovn-cms-options'
hostname : net2 other_config : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="flat:br-flat", ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw, ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}
hostname : net1 other_config : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="flat:br-flat", ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw, ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}
hostname : cmp3 other_config : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}
hostname : cmp4 other_config : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}
RESULT: VMs fail to launch with external network (flat). Error logs "Binding failed for port":
Sep 6 19:42:37 net1 nova-conductor[4270]: 2023-09-06 19:42:37.599 4270 ERROR nova.scheduler.utils [None req-25a6a8d6-8122-4621-a2c2-8ca0be5e594c 52059c7247434072b6823d1701fec23e 116579f970b242b996ac717fa7580311 - - default default] [instance: 8760706e-d38f-454d-b90f-b9d5d322ba99] Error from last host: dev-usc1-ost-cmp4 (node dev-usc1-ost-cmp4.openstack.local): ['Traceback (most recent call last):\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 2607, in _build_and_run_instance\n self.driver.spawn(context, instance, image_meta,\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/virt/libvirt/driver.py", line 4383, in spawn\n xml = self._get_guest_xml(context, instance, network_info,\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/virt/libvirt/driver.py", line 7516, in _get_guest_xml\n network_info_str = str(network_info)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/model.py", line 620, in __str__\n return self._sync_wrapper(fn, *args, **kwargs)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/model.py", line 603, in _sync_wrapper\n self.wait()\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/model.py", line 635, in wait\n self[:] = self._gt.wait()\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/eventlet/greenthread.py", line 181, in wait\n return self._exit_event.wait()\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/eventlet/event.py", line 132, in wait\n current.throw(*self._exc)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/eventlet/greenthread.py", line 221, in main\n result = function(*args, **kwargs)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/utils.py", line 654, in context_wrapper\n return func(*args, **kwargs)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 1987, in _allocate_network_async\n raise e\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 1965, in _allocate_network_async\n nwinfo = self.network_api.allocate_for_instance(\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 1216, in allocate_for_instance\n created_port_ids = self._update_ports_for_instance(\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 1352, in _update_ports_for_instance\n with excutils.save_and_reraise_exception():\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/oslo_utils/excutils.py", line 227, in __exit__\n self.force_reraise()\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/oslo_utils/excutils.py", line 200, in force_reraise\n raise self.value\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 1327, in _update_ports_for_instance\n updated_port = self._update_port(\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 585, in _update_port\n _ensure_no_port_binding_failure(port)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 294, in _ensure_no_port_binding_failure\n raise exception.PortBindingFailed(port_id=port['id'])\n', 'nova.exception.PortBindingFailed: Binding failed for port b82f4518-ecba-49d9-a21d-2646d3f33efd, please check neutron logs for more information.\n', '\nDuring handling of the above exception, another exception occurred:\n\n', 'Traceback (most recent call last):\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 2428, in _do_build_and_run_instance\n self._build_and_run_instance(context, instance, image,\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 2703, in _build_and_run_instance\n raise exception.RescheduledException(\n', 'nova.exception.RescheduledException: Build of instance 8760706e-d38f-454d-b90f-b9d5d322ba99 was re-scheduled: Binding failed for port b82f4518-ecba-49d9-a21d-2646d3f33efd, please check neutron logs for more information.\n']
All we need is to make sure external networks are routed via gateway-hosts and not via compute nodes. In our case, compute nodes have only one physical interface with an IP address and no connectivity to the flat network. No layer 2 connectivity is available on compute nodes either. That's the reason why we must traverse external traffic via gateway nodes only.
It is worth noting that tenant/internal networks work fine.
What are we doing wrong?
Thank you.