neutron-metadata-agent broken pipe
Albert Braden
Albert.Braden at synopsys.com
Tue Dec 10 19:03:50 UTC 2019
That patch didn't fix it. It looks like we have a different issue. Can anyone help?
From: Albert Braden <Albert.Braden at synopsys.com>
Sent: Tuesday, December 10, 2019 10:10 AM
To: openstack-discuss at lists.openstack.org
Subject: RE: neutron-metadata-agent broken pipe
It looks like we may be encountering this bug:
https://bugs.launchpad.net/neutron/+bug/1853071
I'm testing this patch now.
https://review.opendev.org/#/c/697405/
From: Albert Braden
Sent: Monday, December 9, 2019 1:11 PM
To: openstack-discuss at lists.openstack.org<mailto:openstack-discuss at lists.openstack.org>
Subject: RE: neutron-metadata-agent broken pipe
>From my VM, if I try the metadata server repeatedly, it eventually gets a response:
albertb@<html><body><h1>503:~ $ curl http://169.254.169.254/2009-04-04/meta-data/instance-id<https://urldefense.proofpoint.com/v2/url?u=http-3A__169.254.169.254_2009-2D04-2D04_meta-2Ddata_instance-2Did&d=DwMFAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=RTlZ5Jyptp6PGLJVV7aNKTeilTRd8FEteQ8FQa0h-q8&s=dAR8ULtj-DCnAa8gFz_kCfaFc-yZFf7iRxJcUNjHG6g&e=>
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
albertb@<html><body><h1>503:~ $ curl http://169.254.169.254/2009-04-04/meta-data/instance-id<https://urldefense.proofpoint.com/v2/url?u=http-3A__169.254.169.254_2009-2D04-2D04_meta-2Ddata_instance-2Did&d=DwMFAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=RTlZ5Jyptp6PGLJVV7aNKTeilTRd8FEteQ8FQa0h-q8&s=dAR8ULtj-DCnAa8gFz_kCfaFc-yZFf7iRxJcUNjHG6g&e=>
<html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>
albertb@<html><body><h1>503:~ $ curl http://169.254.169.254/2009-04-04/meta-data/instance-id<https://urldefense.proofpoint.com/v2/url?u=http-3A__169.254.169.254_2009-2D04-2D04_meta-2Ddata_instance-2Did&d=DwMFAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=RTlZ5Jyptp6PGLJVV7aNKTeilTRd8FEteQ8FQa0h-q8&s=dAR8ULtj-DCnAa8gFz_kCfaFc-yZFf7iRxJcUNjHG6g&e=>
i-000017ccalbertb@<html><body><h1>503:~ $
Then I see this in neutron-metadata-agent.log:
2019-12-09 12:41:55.213 266833 INFO eventlet.wsgi.server [-] 10.195.72.75,<local> "GET /2009-04-04/meta-data/instance-id HTTP/1.1" status: 200 len: 146 time: 0.3922560
When it fails (503/504) nothing is logged by Neutron but I see haproxy logs:
Dec 9 13:06:14 us01odc-p02-ctrl3 haproxy-metadata-proxy-569e2c42-3935-48e9-ae41-074650e57b2b[267096]: 10.195.72.75:53266 [09/Dec/2019:13:06:14.475] listener listener/metadata 0/0/-1/-1/0 503 212 - - SC-- 58/58/57/57/3 0/0 "GET /2009-04-04/meta-data/instance-id HTTP/1.1"
Dec 9 13:06:59 us01odc-p02-ctrl3 haproxy-metadata-proxy-569e2c42-3935-48e9-ae41-074650e57b2b[267096]: 10.195.72.75:53268 [09/Dec/2019:13:06:27.067] listener listener/metadata 0/0/0/-1/32001 504 194 - - sH-- 90/90/89/89/0 0/0 "GET /2009-04-04/meta-data/instance-id HTTP/1.1"
The load on my controllers is around 15 but they have 48 CPU so 15 should be OK.
When I look at RMQ I see everything fine except for the q-plugin queue; it has 3 consumers but there are thousands of unacked messages and the number gradually increases. Could that be causing the neutron-metadata issue?
From: Albert Braden <Albert.Braden at synopsys.com<mailto:Albert.Braden at synopsys.com>>
Sent: Monday, December 9, 2019 12:20 PM
To: openstack-discuss at lists.openstack.org<mailto:openstack-discuss at lists.openstack.org>
Subject: RE: neutron-metadata-agent broken pipe
When I try to build a VM I see this in the VM logs:
2019-12-09 20:02:21,396 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [0/120s]: bad status code [503]
2019-12-09 20:03:41,084 - util.py[WARNING]: Failed fetching userdata from url http://169.254.169.254/2009-04-04/user-data<https://urldefense.proofpoint.com/v2/url?u=http-3A__169.254.169.254_2009-2D04-2D04_user-2Ddata&d=DwMFAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=RTlZ5Jyptp6PGLJVV7aNKTeilTRd8FEteQ8FQa0h-q8&s=JPv_swVY0usZe3pulTrue-YU9tSB9STeSOeM31D0XGs&e=>
2019-12-09 12:03:53,041 - util.py[WARNING]: Failed running /var/lib/cloud/scripts/per-boot/config_instance.sh [1]
2019-12-09 12:03:53,043 - cc_scripts_per_boot.py[WARNING]: Failed to run module scripts-per-boot (per-boot in /var/lib/cloud/scripts/per-boot)
This is the failing line from the script:
name=`curl -s http://169.254.169.254/2009-04-04/meta-data/hostname`<https://urldefense.proofpoint.com/v2/url?u=http-3A__169.254.169.254_2009-2D04-2D04_meta-2Ddata_hostname-2560&d=DwMFAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=RTlZ5Jyptp6PGLJVV7aNKTeilTRd8FEteQ8FQa0h-q8&s=HM_BR79M_6H0rb1UCrVgXrNnbxSQCx20btqfOkCf6pI&e=>
When I try this from the VM I get this error:
albertb@<html><body><h1>503:~ $ curl -s http://169.254.169.254/2009-04-04/meta-data/hostname<https://urldefense.proofpoint.com/v2/url?u=http-3A__169.254.169.254_2009-2D04-2D04_meta-2Ddata_hostname&d=DwMFAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=RTlZ5Jyptp6PGLJVV7aNKTeilTRd8FEteQ8FQa0h-q8&s=LxPWgb-9sfuKnK1sfmxB5LoRjKcmJ7Hs97fUlcmlnK4&e=>
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
When I check neutron-metadata-agent.log for the time when the VM was failing I see the "broken pipe" errors:
2019-12-09 11:56:00.075 664593 INFO eventlet.wsgi.server [-] Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 521, in handle_one_response
write(b''.join(towrite))
File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 462, in write
wfile.flush()
File "/usr/lib/python2.7/socket.py", line 307, in flush
self._sock.sendall(view[write_offset:write_offset+buffer_size])
File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 390, in sendall
tail = self.send(data, flags)
File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 384, in send
return self._send_loop(self.fd.send, data, flags)
File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 371, in _send_loop
return send_method(data, *args)
error: [Errno 32] Broken pipe
Why is my neutron-metadata server failing? Has anyone else seen this problem? We are running Rocky with about 200 hypervisors; it started after we added 100.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20191210/905387ee/attachment.html>
More information about the openstack-discuss
mailing list