<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoNormal">As our production cluster grows in size we are starting to have trouble with neutron-metadata-agent. After restarting it is happy for a minute and then it complains “2019-12-06 17:54:24.615 664587 WARNING oslo_messaging._drivers.amqpdriver
[-] Number of call queues is 11, greater than warning threshold: 10. There could be a leak. Increasing threshold to: 20”<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">It increases the threshold a couple of times and then after increasing to 40 we start to see errors:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">2019-12-06 17:55:10.119 664578 INFO eventlet.wsgi.server [-] Traceback (most recent call last):<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 521, in handle_one_response<o:p></o:p></p>
<p class="MsoNormal"> write(b''.join(towrite))<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 462, in write<o:p></o:p></p>
<p class="MsoNormal"> wfile.flush()<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/socket.py", line 307, in flush<o:p></o:p></p>
<p class="MsoNormal"> self._sock.sendall(view[write_offset:write_offset+buffer_size])<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 390, in sendall<o:p></o:p></p>
<p class="MsoNormal"> tail = self.send(data, flags)<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 384, in send<o:p></o:p></p>
<p class="MsoNormal"> return self._send_loop(self.fd.send, data, flags)<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 371, in _send_loop<o:p></o:p></p>
<p class="MsoNormal"> return send_method(data, *args)<o:p></o:p></p>
<p class="MsoNormal">error: [Errno 32] Broken pipe<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">It looks like increasing the threshold to 40 fails because it keeps trying:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">2019-12-06 17:55:17.452 664597 WARNING oslo_messaging._drivers.amqpdriver [-] Number of call queues is 21, greater than warning threshold: 20. There could be a leak. Increasing threshold to: 40<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">And the errors increase until the log is nothing but errors, and VMs fail to boot.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">root@us01odc-p02-ctrl1:~# tail -f /var/log/neutron/neutron-metadata-agent.log<o:p></o:p></p>
<p class="MsoNormal"> self._sock.sendall(view[write_offset:write_offset+buffer_size])<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 390, in sendall<o:p></o:p></p>
<p class="MsoNormal"> tail = self.send(data, flags)<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 384, in send<o:p></o:p></p>
<p class="MsoNormal"> return self._send_loop(self.fd.send, data, flags)<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 371, in _send_loop<o:p></o:p></p>
<p class="MsoNormal"> return send_method(data, *args)<o:p></o:p></p>
<p class="MsoNormal">error: [Errno 32] Broken pipe<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.664 664597 INFO eventlet.wsgi.server [-] 10.195.73.174,<local> "GET /latest/meta-data/instance-id HTTP/1.0" status: 200 len: 0 time: 62.1517761<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.756 664583 INFO eventlet.wsgi.server [-] Traceback (most recent call last):<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 521, in handle_one_response<o:p></o:p></p>
<p class="MsoNormal"> write(b''.join(towrite))<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 462, in write<o:p></o:p></p>
<p class="MsoNormal"> wfile.flush()<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/socket.py", line 307, in flush<o:p></o:p></p>
<p class="MsoNormal"> self._sock.sendall(view[write_offset:write_offset+buffer_size])<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 390, in sendall<o:p></o:p></p>
<p class="MsoNormal"> tail = self.send(data, flags)<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 384, in send<o:p></o:p></p>
<p class="MsoNormal"> return self._send_loop(self.fd.send, data, flags)<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 371, in _send_loop<o:p></o:p></p>
<p class="MsoNormal"> return send_method(data, *args)<o:p></o:p></p>
<p class="MsoNormal">error: [Errno 32] Broken pipe<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.757 664583 INFO eventlet.wsgi.server [-] 10.195.65.69,<local> "GET /latest/meta-data/instance-id HTTP/1.0" status: 200 len: 0 time: 63.0419171<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent [-] Unexpected error.: MessagingTimeout: Timed out waiting for a reply to message ID 77551d4cf4394b7b9cdfad68c0be46e8<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent Traceback (most recent call last):<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 89, in __call__<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent instance_id, tenant_id = self._get_instance_and_tenant_id(req)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 162, in _get_instance_and_tenant_id<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent ports = self._get_ports(remote_address, network_id, router_id)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 155, in _get_ports<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent return self._get_ports_for_remote_address(remote_address, networks)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/neutron/common/cache_utils.py", line 116, in __call__<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent return self.func(target_self, *args, **kwargs)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 137, in _get_ports_for_remote_address<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent ip_address=remote_address)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 106, in _get_ports_from_server<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent return self.plugin_rpc.get_ports(self.context, filters)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 72, in get_ports<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent return cctxt.call(context, 'get_ports', filters=filters)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/neutron/common/rpc.py", line 173, in call<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent time.sleep(wait)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent self.force_reraise()<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent six.reraise(self.type_, self.value, self.tb)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/neutron/common/rpc.py", line 150, in call<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent return self._original_context.call(ctxt, method, **kwargs)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 179, in call<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent retry=self.retry)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 133, in _send<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent retry=retry)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 584, in send<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent call_monitor_timeout, retry=retry)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 573, in _send<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent call_monitor_timeout)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 459, in wait<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent message = self.waiters.get(msg_id, timeout=timeout)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 336, in get<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent 'to message ID %s' % msg_id)<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent MessagingTimeout: Timed out waiting for a reply to message ID 77551d4cf4394b7b9cdfad68c0be46e8<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.776 664595 ERROR neutron.agent.metadata.agent<o:p></o:p></p>
<p class="MsoNormal">2019-12-06 17:59:23.778 664595 INFO eventlet.wsgi.server [-] Traceback (most recent call last):<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 521, in handle_one_response<o:p></o:p></p>
<p class="MsoNormal"> write(b''.join(towrite))<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 462, in write<o:p></o:p></p>
<p class="MsoNormal"> wfile.flush()<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/socket.py", line 307, in flush<o:p></o:p></p>
<p class="MsoNormal"> self._sock.sendall(view[write_offset:write_offset+buffer_size])<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 390, in sendall<o:p></o:p></p>
<p class="MsoNormal"> tail = self.send(data, flags)<o:p></o:p></p>
<p class="MsoNormal"> File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 384, in send<o:p></o:p></p>
<p class="MsoNormal"> return self._send_loop(self.fd.send, data, flags)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">What’s causing this? Are we overloading RMQ?<o:p></o:p></p>
</div>
</body>
</html>