Hello,

I operate several OpenStack Ussuri clouds, and lately we have been having intermittent trouble with the metadata service. All our networks are provider networks. The metadata service works fine for months, then some hosts can no longer fetch their metadata and get a 404 error instead.

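I restarted nova-api, neutron-dhcp-agent and neutron-metadata-agent on all controllers, but we still get 404 errors on random nodes. In the example below, 10.60.51.130 fails while 10.60.51.131 works, although both use the same metadata route: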

ssh 10.60.51.130
$ ip r s
...
169.254.169.254 via 10.60.51.121 dev eth0
$ curl -s http://169.254.169.254
<html>
 <head>
  <title>404 Not Found</title>
 </head>
 <body>
  <h1>404 Not Found</h1>
  The resource could not be found.<br /><br />
 </body>
</html>

ssh 10.60.51.131
$ ip r s
...
169.254.169.254 via 10.60.51.121 dev eth0
$ curl -s http://169.254.169.254
{"uuid": "ee6e8dea-796b-4a56-bb1e-1c38f4ac9030", "meta": {"Created By": "terraform"}...}

$ grep "10.60.51.131" /var/log/neutron/neutron-metadata-agent.log
2023-09-22 12:41:10.687 21954 INFO eventlet.wsgi.server [-] 10.60.51.131,<local> "GET /openstack/latest/meta_data.json HTTP/1.1" status: 200  len: 2729 time: 0.3817778
$ grep "10.60.51.130" /var/log/neutron/neutron-metadata-agent.log
$ grep "10.60.51.130" /var/log/nova/nova-api.log # on all controllers
$ grep "10.60.51.131" /var/log/nova/nova-api.log # on all controllers
2023-09-22 12:41:10.682 57433 INFO nova.metadata.wsgi.server [req-1d222a75-ccb4-49bc-8b17-2e2e81dc5f9b - - - - -] 10.60.51.131,1.2.3.4 "GET /openstack/latest/meta_data.json HTTP/1.1" status: 200 len: 2710 time: 0.2587621
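
To compare hosts without grepping for one IP at a time, the agent log can be summarized per client IP and HTTP status. This is only a minimal sketch, assuming the neutron-metadata-agent log line format shown above; the function name summarize_metadata_log is hypothetical and not part of any OpenStack tool.

```shell
# Summarize metadata requests per client IP and HTTP status.
# Reads neutron-metadata-agent log lines on stdin (format assumed
# to match the eventlet.wsgi.server lines shown above).
summarize_metadata_log() {
  awk '/eventlet.wsgi.server/ && /status:/ {
    ip = ""; status = ""
    for (i = 1; i <= NF; i++) {
      # The status code is the field right after the "status:" token.
      if ($i == "status:") status = $(i + 1)
      # The client IP is the first part of the "IP,<local>" field.
      if ($i ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+,/) { split($i, a, ","); ip = a[1] }
    }
    if (ip != "" && status != "") count[ip " " status]++
  }
  END { for (k in count) print k, count[k] }' | sort
}
```

Running `summarize_metadata_log < /var/log/neutron/neutron-metadata-agent.log` on a controller prints one line per client IP and status with a request count. A host that is missing from the output never had a request logged by that agent.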

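Requests from the failing host (10.60.51.130) do not show up in either the neutron-metadata-agent log or the nova-api log. Can you help me troubleshoot why this happens on only some hosts?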

Regards,
--
Jérôme B