[Openstack-operators] [oslo]nova compute reconnection Issue Kilo

Ajay Kalambur (akalambu) akalambu at cisco.com
Thu Apr 21 17:43:42 UTC 2016


Hi,
On Kilo, if I bring down one controller node, some computes sometimes report as down forever.
I need to restart the compute service on the compute node to recover. It looks like oslo is not reconnecting in nova-compute.
Here is the trace from nova-compute:
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 156, in call
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db     retry=self.retry)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 90, in _send
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db     timeout=timeout, retry=retry)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 350, in send
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db     retry=retry)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 339, in _send
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db     result = self._waiter.wait(msg_id, timeout)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 243, in wait
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db     message = self.waiters.get(msg_id, timeout=timeout)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 149, in get
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db     'to message ID %s' % msg_id)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db MessagingTimeout: Timed out waiting for a reply to message ID e064b5f6c8244818afdc5e91fff8ebf1


Any thoughts? I am at stable/kilo for oslo.

Ajay


