[centos][kolla-ansible][wallaby][vpnaas] I need Help

Björn Puttmann puttmann at neozo.de
Tue Aug 3 10:50:01 UTC 2021


Hi!

We had similar problems while testing VPNaaS with Victoria in HA mode.
The main takeaways were the following:

- The path to the pidfile of the Libreswan process in the driver is wrong.
  This causes the process.active check to always fail.

diff --git a/neutron_vpnaas/services/vpn/device_drivers/libreswan_ipsec.py b/neutron_vpnaas/services/vpn/device_drivers/libreswan_ipsec.py
index 90731f7a4..48b178d05 100644
--- a/neutron_vpnaas/services/vpn/device_drivers/libreswan_ipsec.py
+++ b/neutron_vpnaas/services/vpn/device_drivers/libreswan_ipsec.py
@@ -30,6 +30,9 @@ class LibreSwanProcess(ipsec.OpenSwanProcess):
         self._rootwrap_cfg = self._get_rootwrap_config()
         super(LibreSwanProcess, self).__init__(conf, process_id,
                                               vpnservice, namespace)
+        self.pid_path = os.path.join(
+            self.config_dir, 'var', 'run', 'pluto', 'pluto')
+        self.pid_file = '%s.pid' % self.pid_path

     def _ipsec_execute(self, cmd, check_exit_code=True, extra_ok_codes=None):
         """Execute ipsec command on namespace.

- when deploying VPNaaS in HA mode, takeover fails as the standby node is missing 
  the necessary configuration files for Libreswan.

diff --git a/neutron_vpnaas/services/vpn/device_drivers/ipsec.py b/neutron_vpnaas/services/vpn/device_drivers/ipsec.py
index 424c0eea0..7b1c96b3d 100644
--- a/neutron_vpnaas/services/vpn/device_drivers/ipsec.py
+++ b/neutron_vpnaas/services/vpn/device_drivers/ipsec.py
@@ -1077,12 +1077,19 @@ class IPsecDriver(device_drivers.DeviceDriver, metaclass=abc.ABCMeta):
     def report_status(self, context):
         status_changed_vpn_services = []
         for process_id, process in list(self.processes.items()):
-            # NOTE(mnaser): It's not necessary to check status for processes
-            #               of a backup L3 agent
             router = self.routers.get(process_id)
-            if router and router.router['ha'] and router.ha_state == 'backup':
-                LOG.debug("%s router in backup state, skipping", process_id)
-                continue
+            if router and router.router['ha']:
+                # NOTE(mnaser): It's not necessary to check status for processes
+                #               of a backup L3 agent
+                if router.ha_state == 'backup':
+                    LOG.debug("%s router in backup state, skipping", process_id)
+                    continue
+                # When a ha takeover took place, router will initially not be active. Enable it now.
+                if router.ha_state == 'primary' and not process.active:
+                    LOG.debug("%s is primary but not active. Activating.", process_id)
+                    process.ensure_configs()
+                    process.enable()
+                    continue
             if not self.should_be_reported(context, process):
                 continue
             previous_status = self.get_process_status_cache(process)

With these patches applied, VPNaaS worked as expected, at least in Victoria.
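
For readers who want the logic of the two patches without reading the diffs, here is a minimal standalone sketch. The function names pluto_pid_file and ha_action are made up for illustration and do not exist in neutron-vpnaas; the sketch only mirrors the path construction and the branch order shown in the patches.

```python
import os


def pluto_pid_file(config_dir):
    """Pid file location set by the first patch.

    Mirrors: os.path.join(config_dir, 'var', 'run', 'pluto', 'pluto') + '.pid'
    """
    pid_path = os.path.join(config_dir, 'var', 'run', 'pluto', 'pluto')
    return '%s.pid' % pid_path


def ha_action(is_ha, ha_state, process_active):
    """Decision taken per process in the patched report_status loop.

    'skip'     -> HA router in backup state, nothing to check
    'activate' -> HA router is primary but the process is not running yet
    'report'   -> fall through to the normal status reporting
    (The labels are hypothetical names for the three code paths.)
    """
    if is_ha:
        if ha_state == 'backup':
            return 'skip'
        if ha_state == 'primary' and not process_active:
            return 'activate'
    return 'report'
```

In the patch, the 'activate' path corresponds to calling process.ensure_configs() followed by process.enable() on the node that just took over.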

It seems, as Bodo already pointed out, that CentOS 8 ships Libreswan 4.4, which apparently dropped
the "--use-netkey" parameter and now expects "--use-xfrm".

So you could try to patch the corresponding driver files:

diff --git a/neutron_vpnaas/services/vpn/device_drivers/ipsec.py b/neutron_vpnaas/services/vpn/device_drivers/ipsec.py
index 424c0eea0..051719d40 100644
--- a/neutron_vpnaas/services/vpn/device_drivers/ipsec.py
+++ b/neutron_vpnaas/services/vpn/device_drivers/ipsec.py
@@ -647,7 +647,7 @@ class OpenSwanProcess(BaseSwanProcess):
                'pluto',
                '--ctlbase', self.pid_path,
                '--ipsecdir', self.etc_dir,
-               '--use-netkey',
+               '--use-xfrm',
                '--uniqueids',
                '--nat_traversal',
                '--secretsfile', self.secrets_file]

diff --git a/neutron_vpnaas/services/vpn/device_drivers/libreswan_ipsec.py b/neutron_vpnaas/services/vpn/device_drivers/libreswan_ipsec.py
index 90731f7a4..b68e05f81 100644
--- a/neutron_vpnaas/services/vpn/device_drivers/libreswan_ipsec.py
+++ b/neutron_vpnaas/services/vpn/device_drivers/libreswan_ipsec.py
@@ -106,7 +106,7 @@ class LibreSwanProcess(ipsec.OpenSwanProcess):

     def start_pluto(self):
         cmd = ['pluto',
-               '--use-netkey',
+               '--use-xfrm',
                '--uniqueids']

         if self.conf.ipsec.enable_detailed_logging:
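
If you would rather not hard-code the flag, one option is to pick it from the installed Libreswan version. The following is only a sketch under assumptions: the helper name pluto_kernel_flag is made up, the cutoff (4, 4) is simply the version we saw rejecting "--use-netkey" (the real cutoff may be earlier), and the version string is assumed to contain "Libreswan X.Y" as in the output of "ipsec --version".

```python
import re


def pluto_kernel_flag(version_output, xfrm_since=(4, 4)):
    """Pick the pluto kernel-interface flag from a Libreswan version string.

    xfrm_since is an ASSUMED cutoff taken from the observation in this
    thread (4.4 expects --use-xfrm); adjust it to what your packages accept.
    """
    match = re.search(r'Libreswan\s+(\d+)\.(\d+)', version_output)
    if match is None:
        raise ValueError('no Libreswan version found in %r' % version_output)
    version = (int(match.group(1)), int(match.group(2)))
    return '--use-xfrm' if version >= xfrm_since else '--use-netkey'
```

The driver would then put the returned string into the pluto command list in place of the literal flag.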

You could apply the patches to the device driver files on a local machine and copy the patched files into the corresponding 
containers via:

docker cp /path/to/patched/{{ driver_file }} neutron_l3_agent:/usr/lib/python3.6/site-packages/neutron_vpnaas/services/vpn/device_drivers/{{ driver_file }}

After that, restart the container.
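
With several network nodes this gets repetitive, so here is a small sketch that only builds the command lines for both patched driver files plus the restart. The helper name deploy_commands is made up; the container name and the python3.6 site-packages path are the ones from our deployment above and may differ in yours.

```python
import posixpath

# The two driver files touched by the patches in this mail.
DRIVER_FILES = ('ipsec.py', 'libreswan_ipsec.py')
# Path inside the container as used above; an assumption for other images.
DRIVER_DIR = ('/usr/lib/python3.6/site-packages/'
              'neutron_vpnaas/services/vpn/device_drivers')


def deploy_commands(patched_dir, container='neutron_l3_agent'):
    """Return the docker commands (as argv lists) to copy and activate the patches."""
    commands = []
    for name in DRIVER_FILES:
        src = posixpath.join(patched_dir, name)
        dst = '%s:%s' % (container, posixpath.join(DRIVER_DIR, name))
        commands.append(['docker', 'cp', src, dst])
    # Restart last so the agent loads the patched modules.
    commands.append(['docker', 'restart', container])
    return commands
```

Each returned list can be passed to subprocess.run(cmd, check=True) on the host that runs the container.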

At least, that worked for us.

Some commands that proved quite useful for debugging the VPN stuff (please ignore if you already know this, I just found this info very useful myself ;-) ):

Get host for VPN tunnel:

export PROJECT_ID=THE_PROJECT_ID
ROUTER_ID=$(openstack --os-cloud THE_PROJECT_NAME vpn service list --long -f json | jq -r ".[] | select(.Project == \"${PROJECT_ID}\").Router")
echo ${ROUTER_ID}
openstack --os-cloud THE_PROJECT_NAME port list --router ${ROUTER_ID} --device-owner network:ha_router_replicated_interface -c binding_host_id  -f value | sort -u
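
If jq is not available, the same selection can be done in Python on the JSON printed by "vpn service list --long -f json". This is a sketch: the function name routers_for_project is made up, and it relies only on the "Project" and "Router" keys that the jq filter above uses.

```python
import json


def routers_for_project(vpn_services_json, project_id):
    """Return the router IDs of all VPN services owned by project_id.

    Equivalent to: jq -r '.[] | select(.Project == "<id>").Router'
    but de-duplicated and sorted.
    """
    services = json.loads(vpn_services_json)
    return sorted({svc['Router'] for svc in services
                   if svc.get('Project') == project_id})
```

Feed it the captured stdout of the openstack command, then use each returned ID as ROUTER_ID in the port list call.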

Enable detailed logging in /etc/kolla/neutron-l3-agent/l3_agent.ini (restart container after changing this):

[ipsec]
enable_detailed_logging = True

The log file can then be found in the neutron_l3_agent container under /var/lib/neutron/ipsec/ROUTER_ID/log/
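
To double-check that the option is really set before hunting for the log, a quick sketch with Python's configparser follows. The helper names are made up, and I am assuming the file parses as a plain INI file, which holds for the l3_agent.ini files we have seen.

```python
import configparser


def detailed_logging_enabled(ini_text):
    """True if [ipsec] enable_detailed_logging is set to a true value."""
    # strict=False tolerates repeated options; interpolation off keeps '%' literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    return parser.getboolean('ipsec', 'enable_detailed_logging', fallback=False)


def ipsec_log_dir(router_id, base='/var/lib/neutron/ipsec'):
    """Log directory for one router inside the neutron_l3_agent container."""
    return '%s/%s/log/' % (base, router_id)
```

Read /etc/kolla/neutron-l3-agent/l3_agent.ini into a string and pass it to detailed_logging_enabled().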

Get ipsec status (execute in neutron_l3_agent container):

export ROUTER_ID=THE_ROUTER_ID
ip netns exec qrouter-${ROUTER_ID} neutron-vpn-netns-wrapper --mount_paths="/etc:/var/lib/neutron/ipsec/${ROUTER_ID}/etc,/var/run:/var/lib/neutron/ipsec/${ROUTER_ID}/var/run" --rootwrap_config=/etc/neutron/rootwrap.conf --cmd="ipsec,whack,--status"

ipsec.conf for router can be found under /var/lib/neutron/ipsec/ROUTER_ID/etc/ipsec.conf

If you want to change the configuration directly (e.g. to test ciphers), the connection configuration needs to be reloaded:

export ROUTER_ID=THE_ROUTER_ID
ip netns exec qrouter-${ROUTER_ID} neutron-vpn-netns-wrapper --mount_paths="/etc:/var/lib/neutron/ipsec/${ROUTER_ID}/etc,/var/run:/var/lib/neutron/ipsec/${ROUTER_ID}/var/run" --rootwrap_config=/etc/neutron/rootwrap.conf --cmd="ipsec,auto,--replace,CONNECTION_ID"

CONNECTION_ID can be found in ipsec.conf, e.g. 

conn 2b965b6b-2e03-4a89-acff-74cb56335f22
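
To list the connection IDs without opening the file by hand, something like the following sketch works. The function name connection_ids is made up; it just collects the names after "conn" at the start of a line and skips the special "%default" section.

```python
import re

# A "conn <name>" header at the start of a line (leading blanks allowed).
CONN_RE = re.compile(r'^[ \t]*conn[ \t]+(\S+)', re.MULTILINE)


def connection_ids(ipsec_conf_text):
    """Return all connection names defined in an ipsec.conf, in file order."""
    return [name for name in CONN_RE.findall(ipsec_conf_text)
            if name != '%default']
```

Pass it the contents of /var/lib/neutron/ipsec/ROUTER_ID/etc/ipsec.conf and use the result as CONNECTION_ID.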

To initiate the tunnel after configuration change:

export ROUTER_ID=THE_ROUTER_ID
ip netns exec qrouter-${ROUTER_ID} neutron-vpn-netns-wrapper --mount_paths="/etc:/var/lib/neutron/ipsec/${ROUTER_ID}/etc,/var/run:/var/lib/neutron/ipsec/${ROUTER_ID}/var/run" --rootwrap_config=/etc/neutron/rootwrap.conf --cmd="ipsec,whack,--initiate,--name,CONNECTION_ID"

To reread secrets:

export ROUTER_ID=THE_ROUTER_ID
ip netns exec qrouter-${ROUTER_ID} neutron-vpn-netns-wrapper --mount_paths="/etc:/var/lib/neutron/ipsec/${ROUTER_ID}/etc,/var/run:/var/lib/neutron/ipsec/${ROUTER_ID}/var/run" --rootwrap_config=/etc/neutron/rootwrap.conf --cmd="ipsec,auto,--rereadsecrets"
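
The wrapper invocations above differ only in the --cmd part, so a small helper can build them. This is a sketch: the function name netns_ipsec_command is made up, and the paths are the ones from the commands above.

```python
def netns_ipsec_command(router_id, *ipsec_args, base='/var/lib/neutron/ipsec'):
    """Build the 'ip netns exec ... neutron-vpn-netns-wrapper' argv list.

    ipsec_args are the words after 'ipsec', e.g. ('whack', '--status').
    """
    state_dir = '%s/%s' % (base, router_id)
    mount_paths = '/etc:%s/etc,/var/run:%s/var/run' % (state_dir, state_dir)
    return [
        'ip', 'netns', 'exec', 'qrouter-%s' % router_id,
        'neutron-vpn-netns-wrapper',
        '--mount_paths=%s' % mount_paths,
        '--rootwrap_config=/etc/neutron/rootwrap.conf',
        # The wrapper takes the command as one comma-separated string.
        '--cmd=%s' % ','.join(('ipsec',) + ipsec_args),
    ]
```

For example, netns_ipsec_command(router_id, 'auto', '--rereadsecrets') gives the argv for the last command; run it inside the neutron_l3_agent container, e.g. with subprocess.run.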

Hope this is of some use.

All the best and take care,
Björn Puttmann
--
NEOZO Cloud GmbH





