neutron-metadata-agent broken pipe

Albert Braden Albert.Braden at synopsys.com
Sat Dec 7 02:11:10 UTC 2019


As our production cluster grows, we are starting to have trouble with neutron-metadata-agent. After a restart it is happy for a minute, and then it complains: "2019-12-06 17:54:24.615 664587 WARNING oslo_messaging._drivers.amqpdriver [-] Number of call queues is 11, greater than warning threshold: 10. There could be a leak. Increasing threshold to: 20"
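
As a rough model of the warning quoted above: the sketch below assumes the warning fires whenever the number of pending call queues exceeds the current threshold, and that the threshold then doubles, which is what the numbers in the message (11 > 10, new threshold 20) suggest. The function name `simulate_thresholds` is hypothetical and is not part of oslo.messaging.

```python
def simulate_thresholds(queue_counts, threshold=10):
    """Return (count, old_threshold, new_threshold) for each warning.

    Assumption: a warning fires when the number of pending call queues
    exceeds the threshold, and the threshold doubles afterwards.
    """
    warnings = []
    for count in queue_counts:
        if count > threshold:
            warnings.append((count, threshold, threshold * 2))
            threshold *= 2
    return warnings


# Pending call queues growing one at a time, as they would if RPC
# replies stopped arriving while new requests kept coming in.
for count, old, new in simulate_thresholds(range(1, 45)):
    print("Number of call queues is %d, greater than warning threshold: %d. "
          "Increasing threshold to: %d" % (count, old, new))
```

Under that assumption, each warning marks the point where the count of outstanding calls has doubled again, so repeated warnings would indicate that replies are not coming back as fast as requests go out.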

It increases the threshold a couple of times, and after the increase to 40 we start to see errors:

2019-12-06 17:55:10.119 664578 INFO eventlet.wsgi.server [-] Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 521, in handle_one_response
    write(b''.join(towrite))
  File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 462, in write
    wfile.flush()
  File "/usr/lib/python2.7/socket.py", line 307, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 390, in sendall
    tail = self.send(data, flags)
  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 384, in send
    return self._send_loop(self.fd.send, data, flags)
  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 371, in _send_loop
    return send_method(data, *args)
error: [Errno 32] Broken pipe

It looks like increasing the threshold to 40 fails because it keeps trying:

2019-12-06 17:55:17.452 664597 WARNING oslo_messaging._drivers.amqpdriver [-] Number of call queues is 21, greater than warning threshold: 20. There could be a leak. Increasing threshold to: 40

And the errors increase until the log is nothing but errors, and VMs fail to boot.
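
To see how quickly the errors take over the log, a small script can tally them per minute. This is only a sketch, not a Neutron tool: the name `summarize` and the category labels are made up for illustration, and the patterns are taken solely from the log lines quoted in this message.

```python
import re
from collections import Counter

# Matches the "date time pid LEVEL " prefix seen in the log excerpts.
PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d+ (\d+) (\w+) ")


def summarize(lines):
    """Count broken pipes, RPC timeouts and queue warnings per minute."""
    counts = Counter()
    minute = "unknown"  # traceback lines carry no timestamp of their own
    for line in lines:
        match = PREFIX.match(line)
        if match:
            minute = match.group(1)
        if "[Errno 32] Broken pipe" in line:
            counts[(minute, "broken_pipe")] += 1
        elif "Unexpected error.: MessagingTimeout" in line:
            counts[(minute, "rpc_timeout")] += 1
        elif "greater than warning threshold" in line:
            counts[(minute, "queue_warning")] += 1
    return counts


sample = [
    "2019-12-06 17:55:10.119 664578 INFO eventlet.wsgi.server [-] "
    "Traceback (most recent call last):",
    "error: [Errno 32] Broken pipe",
    "2019-12-06 17:55:17.452 664597 WARNING oslo_messaging._drivers.amqpdriver "
    "[-] Number of call queues is 21, greater than warning threshold: 20. "
    "There could be a leak. Increasing threshold to: 40",
    "2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent [-] "
    "Unexpected error.: MessagingTimeout: Timed out waiting for a reply",
]

for (minute, kind), n in sorted(summarize(sample).items()):
    print(minute, kind, n)
```

To run it against the real log, pass an open file object for /var/log/neutron/neutron-metadata-agent.log instead of `sample`. Lines without a timestamp, such as the "Broken pipe" line ending each traceback, are attributed to the most recent timestamped line before them.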

root@us01odc-p02-ctrl1:~# tail -f /var/log/neutron/neutron-metadata-agent.log
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 390, in sendall
    tail = self.send(data, flags)
  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 384, in send
    return self._send_loop(self.fd.send, data, flags)
  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 371, in _send_loop
    return send_method(data, *args)
error: [Errno 32] Broken pipe

2019-12-06 17:59:23.664 664597 INFO eventlet.wsgi.server [-] 10.195.73.174,<local> "GET /latest/meta-data/instance-id HTTP/1.0" status: 200  len: 0 time: 62.1517761
2019-12-06 17:59:23.756 664583 INFO eventlet.wsgi.server [-] Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 521, in handle_one_response
    write(b''.join(towrite))
  File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 462, in write
    wfile.flush()
  File "/usr/lib/python2.7/socket.py", line 307, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 390, in sendall
    tail = self.send(data, flags)
  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 384, in send
    return self._send_loop(self.fd.send, data, flags)
  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 371, in _send_loop
    return send_method(data, *args)
error: [Errno 32] Broken pipe

2019-12-06 17:59:23.757 664583 INFO eventlet.wsgi.server [-] 10.195.65.69,<local> "GET /latest/meta-data/instance-id HTTP/1.0" status: 200  len: 0 time: 63.0419171
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent [-] Unexpected error.: MessagingTimeout: Timed out waiting for a reply to message ID 77551d4cf4394b7b9cdfad68c0be46e8
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent Traceback (most recent call last):
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 89, in __call__
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     instance_id, tenant_id = self._get_instance_and_tenant_id(req)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 162, in _get_instance_and_tenant_id
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     ports = self._get_ports(remote_address, network_id, router_id)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 155, in _get_ports
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     return self._get_ports_for_remote_address(remote_address, networks)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/neutron/common/cache_utils.py", line 116, in __call__
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     return self.func(target_self, *args, **kwargs)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 137, in _get_ports_for_remote_address
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     ip_address=remote_address)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 106, in _get_ports_from_server
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     return self.plugin_rpc.get_ports(self.context, filters)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 72, in get_ports
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     return cctxt.call(context, 'get_ports', filters=filters)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/neutron/common/rpc.py", line 173, in call
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     time.sleep(wait)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     self.force_reraise()
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     six.reraise(self.type_, self.value, self.tb)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/neutron/common/rpc.py", line 150, in call
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     return self._original_context.call(ctxt, method, **kwargs)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 179, in call
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     retry=self.retry)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 133, in _send
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     retry=retry)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 584, in send
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     call_monitor_timeout, retry=retry)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 573, in _send
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     call_monitor_timeout)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 459, in wait
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     message = self.waiters.get(msg_id, timeout=timeout)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 336, in get
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent     'to message ID %s' % msg_id)
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent MessagingTimeout: Timed out waiting for a reply to message ID 77551d4cf4394b7b9cdfad68c0be46e8
2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent
2019-12-06 17:59:23.778 664595 INFO eventlet.wsgi.server [-] Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 521, in handle_one_response
    write(b''.join(towrite))
  File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 462, in write
    wfile.flush()
  File "/usr/lib/python2.7/socket.py", line 307, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 390, in sendall
    tail = self.send(data, flags)
  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 384, in send
    return self._send_loop(self.fd.send, data, flags)

What's causing this? Are we overloading RabbitMQ?