Hello,

I appreciate the prompt feedback. Unfortunately, after making multiple changes, we still cannot get external networks to connect via the gateway hosts. Our follow-up investigation has shown the following:

1. Removed the flat and vlan provider_networks from /etc/openstack_deploy/openstack_user_config.yml. Only the management provider network is defined there now:

provider_networks:
    - network:
        container_bridge: "br-mgmt"
        container_type: "veth"
        container_interface: "ens4"
        ip_from_q: "management"
        type: "raw"
        group_binds:
          - all_containers
          - hosts
        is_management_address: true

 
2. Defined ML2 information and network types in /etc/openstack_deploy/user_variables.yml:

neutron_ml2_conf_ini_overrides:
  ml2:
    tenant_network_types: geneve
  ml2_type_flat:
    flat_networks: flat
  ml2_type_geneve:
    vni_ranges: 1:1000
    max_header_size: 38

 
3. Moved the neutron_provider_networks configuration to a per-host basis in /etc/openstack_deploy/host_vars/ and removed network_mappings and network_interface_mappings for the compute hosts:

compute node
/etc/openstack_deploy/host_vars/cmp3:

neutron_provider_networks:
  network_types: "geneve"
  network_geneve_ranges: "1:1000"


gateway node
/etc/openstack_deploy/host_vars/net1:

neutron_provider_networks:
  network_types: "geneve"
  network_geneve_ranges: "1:1000"
  network_mappings: "flat:br-flat"
  network_interface_mappings: "br-flat:ens2"
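
The per-host split above can be cross-checked against the ML2 flat_networks value with a short script. This is only a minimal sketch, not part of our deployment: the function names (parse_mappings, unmapped_physnets) are hypothetical, the values are copied by hand from the snippets in this message rather than read from the files, and a wildcard flat_networks value ("*") is not handled.

```python
def parse_mappings(value: str) -> dict:
    """Parse "physnet:bridge[,physnet:bridge...]" into a dict."""
    mappings = {}
    for item in value.split(","):
        item = item.strip()
        if item:
            physnet, _, bridge = item.partition(":")
            mappings[physnet] = bridge
    return mappings


def unmapped_physnets(flat_networks: str, host_vars: dict) -> list:
    """List the flat physnets for which a host has no bridge mapping."""
    provider = host_vars.get("neutron_provider_networks", {})
    mapped = parse_mappings(provider.get("network_mappings", ""))
    wanted = [name.strip() for name in flat_networks.split(",") if name.strip()]
    return [name for name in wanted if name not in mapped]


# Values copied from the ML2 override and the host_vars files shown above.
FLAT_NETWORKS = "flat"
HOST_VARS = {
    "cmp3": {"neutron_provider_networks": {
        "network_types": "geneve",
        "network_geneve_ranges": "1:1000",
    }},
    "net1": {"neutron_provider_networks": {
        "network_types": "geneve",
        "network_geneve_ranges": "1:1000",
        "network_mappings": "flat:br-flat",
        "network_interface_mappings": "br-flat:ens2",
    }},
}

if __name__ == "__main__":
    for host, host_vars in HOST_VARS.items():
        print(host, unmapped_physnets(FLAT_NETWORKS, host_vars))
```

With these values the gateway host reports no unmapped physnets, while the compute host reports "flat" as unmapped, which matches the intent of step 3.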
 

4. The newly recreated inventory targets the correct neutron_ovn_gateway hosts:
/etc/openstack_deploy/openstack_inventory.json


"component": "neutron_ovn_gateway",
                "container_name": "net1",
                "container_networks": {
                    "management_address": {
                        "address": "172.16.0.31",
                        "bridge": "br-mgmt",

--

                "component": "neutron_ovn_gateway",
                "container_name": "net2",
                "container_networks": {
                    "management_address": {
                        "address": "172.16.0.32",
                        "bridge": "br-mgmt",

--
"neutron_ovn_gateway": {
        "children": [],
        "hosts": [
            "net1",
            "net2"


 

 
5. The correct ovn-cms-options=enable-chassis-as-gw is set on gateway nodes only:

ovn-sbctl list chassis | grep 'hostname\|ovn-cms-options'

hostname            : net2
other_config        : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="flat:br-flat", ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw, ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}

hostname            : net1
other_config        : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="flat:br-flat", ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw, ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}

hostname            : cmp3
other_config        : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}

hostname            : cmp4
other_config        : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}
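
Since the other_config maps are long, the two relevant keys can be pulled out per chassis with a small parser. This is a minimal sketch under assumptions: the helper names (option, summarize_chassis) are hypothetical, the input is the grep-filtered text format shown above, and the sample is a trimmed copy of two of the four entries.

```python
import re


def option(other_config: str, key: str) -> str:
    """Return the value of `key` from an other_config map string, or ""."""
    # Values appear either quoted (key="a,b") or bare (key=a).
    pattern = r'(?<![\w-])' + re.escape(key) + r'=(?:"([^"]*)"|([^,}\s]*))'
    match = re.search(pattern, other_config)
    if match is None:
        return ""
    return match.group(1) if match.group(1) is not None else match.group(2)


def summarize_chassis(text: str) -> dict:
    """Map each hostname to its gateway flag and bridge mappings."""
    summary = {}
    hostname = None
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # blank or unrelated line
        name, value = name.strip(), value.strip()
        if name == "hostname":
            hostname = value
        elif name == "other_config" and hostname is not None:
            cms_options = option(value, "ovn-cms-options").split(",")
            summary[hostname] = {
                "gateway": "enable-chassis-as-gw" in cms_options,
                "bridge_mappings": option(value, "ovn-bridge-mappings"),
            }
    return summary


# Trimmed sample of the output above (most other_config keys omitted).
SAMPLE = """\
hostname            : net1
other_config        : {datapath-type=system, ovn-bridge-mappings="flat:br-flat", ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw, ovn-ct-lb-related="true"}

hostname            : cmp3
other_config        : {datapath-type=system, ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-ct-lb-related="true"}
"""

if __name__ == "__main__":
    for host, info in summarize_chassis(SAMPLE).items():
        print(host, info)
```

On the sample, net1 is flagged as a gateway with the mapping flat:br-flat, and cmp3 is not a gateway and has no bridge mappings.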

 
RESULT: VMs fail to launch when attached to the external (flat) network. The logs show the error "Binding failed for port":
 

Sep  6 19:42:37 net1 nova-conductor[4270]: 2023-09-06 19:42:37.599 4270 ERROR nova.scheduler.utils [None req-25a6a8d6-8122-4621-a2c2-8ca0be5e594c 52059c7247434072b6823d1701fec23e 116579f970b242b996ac717fa7580311 - - default default] [instance: 8760706e-d38f-454d-b90f-b9d5d322ba99] Error from last host: dev-usc1-ost-cmp4 (node dev-usc1-ost-cmp4.openstack.local): ['Traceback (most recent call last):\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 2607, in _build_and_run_instance\n    self.driver.spawn(context, instance, image_meta,\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/virt/libvirt/driver.py", line 4383, in spawn\n    xml = self._get_guest_xml(context, instance, network_info,\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/virt/libvirt/driver.py", line 7516, in _get_guest_xml\n    network_info_str = str(network_info)\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/model.py", line 620, in __str__\n    return self._sync_wrapper(fn, *args, **kwargs)\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/model.py", line 603, in _sync_wrapper\n    self.wait()\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/model.py", line 635, in wait\n    self[:] = self._gt.wait()\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/eventlet/greenthread.py", line 181, in wait\n    return self._exit_event.wait()\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/eventlet/event.py", line 132, in wait\n    current.throw(*self._exc)\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/eventlet/greenthread.py", line 221, in main\n    result = function(*args, **kwargs)\n', '  File 
"/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/utils.py", line 654, in context_wrapper\n    return func(*args, **kwargs)\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 1987, in _allocate_network_async\n    raise e\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 1965, in _allocate_network_async\n    nwinfo = self.network_api.allocate_for_instance(\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 1216, in allocate_for_instance\n    created_port_ids = self._update_ports_for_instance(\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 1352, in _update_ports_for_instance\n    with excutils.save_and_reraise_exception():\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/oslo_utils/excutils.py", line 227, in __exit__\n    self.force_reraise()\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/oslo_utils/excutils.py", line 200, in force_reraise\n    raise self.value\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 1327, in _update_ports_for_instance\n    updated_port = self._update_port(\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 585, in _update_port\n    _ensure_no_port_binding_failure(port)\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 294, in _ensure_no_port_binding_failure\n    raise exception.PortBindingFailed(port_id=port[\'id\'])\n', 'nova.exception.PortBindingFailed: Binding failed for port b82f4518-ecba-49d9-a21d-2646d3f33efd, please check neutron logs for more information.\n', '\nDuring handling of the above exception, another exception occurred:\n\n', 'Traceback 
(most recent call last):\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 2428, in _do_build_and_run_instance\n    self._build_and_run_instance(context, instance, image,\n', '  File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 2703, in _build_and_run_instance\n    raise exception.RescheduledException(\n', 'nova.exception.RescheduledException: Build of instance 8760706e-d38f-454d-b90f-b9d5d322ba99 was re-scheduled: Binding failed for port b82f4518-ecba-49d9-a21d-2646d3f33efd, please check neutron logs for more information.\n']
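
The traceback is hard to read as pasted, so the affected port ID can be extracted for searching the neutron logs. This is a minimal sketch: the function name failed_ports is hypothetical, and the sample text is a short excerpt of the nova-conductor message above.

```python
import re

# Matches the message raised by nova's PortBindingFailed, followed by a UUID.
PORT_RE = re.compile(
    r"Binding failed for port "
    r"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
)


def failed_ports(log_text: str) -> list:
    """Return the unique port IDs that failed to bind, sorted."""
    return sorted(set(PORT_RE.findall(log_text)))


# Excerpt of the log entry above; the same port appears twice in it.
SAMPLE = (
    "nova.exception.PortBindingFailed: Binding failed for port "
    "b82f4518-ecba-49d9-a21d-2646d3f33efd, please check neutron logs "
    "for more information. nova.exception.RescheduledException: Build of "
    "instance 8760706e-d38f-454d-b90f-b9d5d322ba99 was re-scheduled: "
    "Binding failed for port b82f4518-ecba-49d9-a21d-2646d3f33efd, "
    "please check neutron logs for more information."
)

if __name__ == "__main__":
    print(failed_ports(SAMPLE))
```

The resulting port ID can then be used to search the neutron-server logs, as the nova message suggests.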


All we need is for external network traffic to be routed via the gateway hosts and not via the compute nodes. In our case, the compute nodes have only one physical interface with an IP address and no connectivity to the flat network. No layer 2 connectivity is available on the compute nodes either. That is why external traffic must traverse the gateway nodes only.

It is worth noting that tenant/internal networks work fine.

What are we doing wrong?

Thank you.