[Openstack-operators] Can't boot image getting nova.api.openstack MessagingTimeout: Timed out waiting for a reply to message ID msg_id

Kris G. Lindgren klindgren at godaddy.com
Tue Nov 24 16:12:13 UTC 2015


Evan,

I would recommend restarting anything that takes to rabbitmq.  This is a case of your rpc workers thinking that they are connected to rabbitmq but they really aren't.  I assume that this setup use to work at one time and now doesn’t?

___________________________________________________________________
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Evan <evandeng2009 at qq.com<mailto:evandeng2009 at qq.com>>
Date: Tuesday, November 24, 2015 at 5:02 AM
To: OpenStack-operators <OpenStack-operators at lists.openstack.org<mailto:OpenStack-operators at lists.openstack.org>>
Subject: [Openstack-operators] Can't boot image getting nova.api.openstack MessagingTimeout: Timed out waiting for a reply to message ID msg_id

hi,

I keep getting such error(as subject) when trying to boot an image for instance, no matter from webpage dashboard or cli. firstly it show the status of scheduling and after a while it's Error. I checked status of all service ,nothing gets failed. Below are the logs for nova-api and nova-conductor.

It seems like the problem of rabbitmq, but it's running well without any explicit error shown. Once I tried to restart the rabbitmq then instance created successfully. Before this case happens I tried to set a 2nd external network on Network node but failed so I rolled it back(changed configs back and restart related services). Also checked those common CLI they're all good like nova image-list, nova service-list,neutron agent-list ,keystone service-list.

Openstack-Juno, with 1 controller, 1 network, 3 compute, 1 block and 1 object, servers have much more resources than needed.


controller# grep MessagingTimeout /var/log/nova/nova-api.log
......
2015-11-24 15:58:00.404 18197 TRACE nova.api.openstack MessagingTimeout: Timed out waiting for a reply to message ID 039995083875445bacdd3bd8667cda51
2015-11-24 15:59:02.618 18204 TRACE nova.api.openstack MessagingTimeout: Timed out waiting for a reply to message ID 58c3ea4f3c2d43699632c1999df79e91
2015-11-24 18:58:28.030 18201 TRACE nova.api.openstack MessagingTimeout: Timed out waiting for a reply to message ID 6b43bf60e2cd494fab116515c5997ab5

controller# grep ERROR /var/log/nova/nova-api.log
......
2015-11-24 18:57:27.003 18201 ERROR oslo.messaging._drivers.impl_rabbit [req-2b810cc0-d809-4f2c-8a01-d1f7eca62b48 ] Failed to consume message from queue: [Errno 104] Connection reset by peer
2015-11-24 18:58:28.030 18201 ERROR nova.api.openstack [req-2b810cc0-d809-4f2c-8a01-d1f7eca62b48 None] Caught error: Timed out waiting for a reply to message ID 6b43bf60e2cd494fab116515c5997ab5
2015-11-24 19:07:26.591 18200 ERROR oslo.messaging._drivers.impl_rabbit [req-f532b095-5f37-432d-8cdc-06db8498b3c1 ] Failed to publish message to topic 'conductor': [Errno 104] Connection reset by peer
2015-11-24 19:22:36.310 18183 ERROR oslo.messaging._drivers.impl_rabbit [req-24c4ce2a-6516-401e-a6f8-8a89758cc6fa ] Failed to publish message to topic 'conductor': [Errno 104] Connection reset by peer

# tail -n 80 /var/log/nova/nova-conductor.log
......
2015-11-24 19:22:37.356 17770 TRACE oslo.messaging._drivers.impl_rabbit   File "/usr/lib/python2.7/site-packages/eventlet/greenio.py", line 342, in send
2015-11-24 19:22:37.356 17770 TRACE oslo.messaging._drivers.impl_rabbit     total_sent += fd.send(data[total_sent:], flags)
2015-11-24 19:22:37.356 17770 TRACE oslo.messaging._drivers.impl_rabbit error: [Errno 104] Connection reset by peer
2015-11-24 19:22:37.356 17770 TRACE oslo.messaging._drivers.impl_rabbit
2015-11-24 19:22:37.357 17770 INFO oslo.messaging._drivers.impl_rabbit [req-24c4ce2a-6516-401e-a6f8-8a89758cc6fa ] Delaying reconnect for 1.0 seconds...
2015-11-24 19:22:38.358 17770 INFO oslo.messaging._drivers.impl_rabbit [req-24c4ce2a-6516-401e-a6f8-8a89758cc6fa ] Connecting to AMQP server on controller:5672
2015-11-24 19:22:38.379 17770 INFO oslo.messaging._drivers.impl_rabbit [req-24c4ce2a-6516-401e-a6f8-8a89758cc6fa ] Connected to AMQP server on controller:5672
2015-11-24 19:23:38.385 17770 ERROR nova.scheduler.driver [req-24c4ce2a-6516-401e-a6f8-8a89758cc6fa None] Exception during scheduler.run_instance
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver Traceback (most recent call last):
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver   File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 616, in build_instances
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver     request_spec, filter_properties)
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 49, in select_destinations
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver     context, request_spec, filter_properties)
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 35, in __run_method
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver     return getattr(self.instance, __name)(*args, **kwargs)
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/query.py", line 34, in select_destinations
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver     context, request_spec, filter_properties)
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver   File "/usr/lib/python2.7/site-packages/nova/scheduler/rpcapi.py", line 108, in select_destinations
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver     request_spec=request_spec, filter_properties=filter_properties)
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver   File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/client.py", line 152, in call
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver     retry=self.retry)
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver   File "/usr/lib/python2.7/site-packages/oslo/messaging/transport.py", line 90, in _send
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver     timeout=timeout, retry=retry)
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver   File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 408, in send
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver     retry=retry)
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver   File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 397, in _send
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver     result = self._waiter.wait(msg_id, timeout)
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver   File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 285, in wait
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver     reply, ending = self._poll_connection(msg_id, timeout)
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver   File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 235, in _poll_connection
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver     % msg_id)
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver MessagingTimeout: Timed out waiting for a reply to message ID d015995266a24e05b21ff0cd2ff3c6bc
2015-11-24 19:23:38.385 17770 TRACE nova.scheduler.driver
2015-11-24 19:23:38.388 17770 WARNING nova.scheduler.driver [req-24c4ce2a-6516-401e-a6f8-8a89758cc6fa None] [instance: 66147c0d-fa53-4e2b-8447-d4f20ac7bcbb] Setting instance to ERROR state.

controller# systemctl status rabbitmq-server
rabbitmq-server.service - RabbitMQ broker
   Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled)
   Active: active (running) since Tue 2015-11-24 15:02:41 CST; 4h 43min ago
  Process: 137583 ExecStop=/usr/lib/rabbitmq/bin/rabbitmqctl stop (code=exited, status=0/SUCCESS)
 Main PID: 137639 (beam.smp)
   CGroup: /system.slice/rabbitmq-server.service
           ├─137639 /usr/lib64/erlang/erts-5.10.4/bin/beam.smp -W w -K true -A30 -P 1048576 -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq -- -pa /usr/lib/rabbitmq/lib/rabbitmq_s...
           ├─137654 /usr/lib64/erlang/erts-5.10.4/bin/epmd -daemon
           ├─137767 inet_gethost 4
           └─137768 inet_gethost 4

Nov 24 15:02:40 controller systemd[1]: rabbitmq-server.service: Got notification message from PID 137701, but reception only permitted for PID 137639
Nov 24 15:02:40 controller systemd[1]: rabbitmq-server.service: Got notification message from PID 137702, but reception only permitted for PID 137639
Nov 24 15:02:41 controller rabbitmq-server[137639]: RabbitMQ 3.3.5. Copyright (C) 2007-2014 GoPivotal, Inc.
Nov 24 15:02:41 controller rabbitmq-server[137639]: ##  ##      Licensed under the MPL.  See http://www.rabbitmq.com/
Nov 24 15:02:41 controller rabbitmq-server[137639]: ##  ##
Nov 24 15:02:41 controller rabbitmq-server[137639]: ##########  Logs: /var/log/rabbitmq/rabbit at controller.log<mailto:/var/log/rabbitmq/rabbit at controller.log>
Nov 24 15:02:41 controller rabbitmq-server[137639]: ######  ##        /var/log/rabbitmq/rabbit at controller-sasl.log<mailto:/var/log/rabbitmq/rabbit at controller-sasl.log>
Nov 24 15:02:41 controller rabbitmq-server[137639]: ##########
Nov 24 15:02:41 controller rabbitmq-server[137639]: Starting broker... completed with 6 plugins.
Nov 24 15:02:41 controller systemd[1]: Started RabbitMQ broker.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-operators/attachments/20151124/7d0e7e82/attachment.html>


More information about the OpenStack-operators mailing list