[Openstack-operators] Rabbitmq server issues
Linux Datacenter
linuxdatacenter at gmail.com
Mon Aug 8 07:49:24 UTC 2011
Hi,
It looks like the rabbitmq server on my main nova node keeps crashing. I keep
getting messages like this on my compute nodes:
2011-08-08 09:16:31,816 ERROR nova.rpc [-] Failed to fetch message from queue: (320, u"CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", (0, 0), '')
(nova.rpc): TRACE: Traceback (most recent call last):
(nova.rpc): TRACE: File "/usr/lib/pymodules/python2.6/nova/rpc.py", line 126, in fetch
(nova.rpc): TRACE: super(Consumer, self).fetch(no_ack, auto_ack, enable_callbacks)
(nova.rpc): TRACE: File "/usr/lib/pymodules/python2.6/carrot/messaging.py", line 304, in fetch
(nova.rpc): TRACE: message = self.backend.get(self.queue, no_ack=no_ack)
(nova.rpc): TRACE: File "/usr/lib/pymodules/python2.6/carrot/backends/pyamqplib.py", line 252, in get
(nova.rpc): TRACE: raw_message = self.channel.basic_get(queue, no_ack=no_ack)
(nova.rpc): TRACE: File "/usr/lib/pymodules/python2.6/amqplib/client_0_8/channel.py", line 2032, in basic_get
(nova.rpc): TRACE: (60, 72), # Channel.basic_get_empty
(nova.rpc): TRACE: File "/usr/lib/pymodules/python2.6/amqplib/client_0_8/abstract_channel.py", line 89, in wait
(nova.rpc): TRACE: self.channel_id, allowed_methods)
(nova.rpc): TRACE: File "/usr/lib/pymodules/python2.6/amqplib/client_0_8/connection.py", line 218, in _wait_method
(nova.rpc): TRACE: self.wait()
(nova.rpc): TRACE: File "/usr/lib/pymodules/python2.6/amqplib/client_0_8/abstract_channel.py", line 105, in wait
(nova.rpc): TRACE: return amqp_method(self, args)
(nova.rpc): TRACE: File "/usr/lib/pymodules/python2.6/amqplib/client_0_8/connection.py", line 367, in _close
(nova.rpc): TRACE: raise AMQPConnectionException(reply_code, reply_text, (class_id, method_id))
(nova.rpc): TRACE: AMQPConnectionException: (320, u"CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", (0, 0), '')
Also, their status in "nova-manage service list" is:
nova-compute enabled XXX
When I restart the rabbitmq server, I get this:
2011-08-08 09:16:34,809 ERROR nova.rpc [-] Reconnected to queue
2011-08-08 09:16:34,810 ERROR nova.rpc [-] Reconnected to queue
2011-08-08 09:16:34,811 ERROR nova.rpc [-] Reconnected to queue
It looks like the node has reconnected, but the status of nova-compute is
still XXX.
Can anyone suggest a reasonable remedy for this issue? (The first one I can
think of is a periodic restart of the rabbitmq server and the nova-compute
daemons on all my servers.)
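Before restarting anything blindly, the XXX check itself could be scripted.
Below is a minimal sketch, not a tested tool: it assumes the output of
"nova-manage service list" is whitespace-separated with the columns host,
service, enabled/disabled, status, and that a healthy service shows ":-)"
where a dead one shows XXX. The host names, timestamps and the function name
find_dead_services are made up for illustration.

```python
def find_dead_services(listing):
    """Return (host, service) pairs whose status column reads XXX."""
    dead = []
    for line in listing.splitlines():
        fields = line.split()
        # Assumed column layout: host, service, enabled/disabled, status, ...
        if len(fields) >= 4 and fields[3] == "XXX":
            dead.append((fields[0], fields[1]))
    return dead


# Hypothetical sample output; host names and timestamps are invented.
SAMPLE = """\
node1 nova-compute enabled XXX 2011-08-08 07:16:31
node2 nova-compute enabled :-) 2011-08-08 07:49:02
main nova-scheduler enabled :-) 2011-08-08 07:49:05
"""

print(find_dead_services(SAMPLE))
```

Feeding it the real command output instead of SAMPLE would give the list of
hosts where only nova-compute needs a restart, rather than restarting
everything everywhere.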
PS.
Searching Google for "nova-compute XXX" may return different results
depending on your parental filter settings ;-)
So it might be a good idea to change it to "OK" or whatever ;-)
Regards,
-Piotr
--
check out my blog on linux clusters:
-- linuxdatacenter.blogspot.com --