[Openstack] controller node nova-compute error after reboot system.

王鹏 breakwindwp at gmail.com
Mon Aug 27 09:34:04 UTC 2012


Hi everyone!
I have finally finished installing OpenStack on 3 nodes (two compute nodes
and one all-in-one node acting as the controller).
However, when I reboot the system, the controller's nova-compute service (which
hosts instances) fails with this error:
nova-compute.log:
2012-08-22 19:28:51 INFO nova.rpc.common
[req-d38eb14e-dfd8-4c6a-ae26-fb1e522765eb None None] Connected to AMQP server on
172.18.32.7:5672
2012-08-22 19:29:51 ERROR nova.rpc.common
[req-d38eb14e-dfd8-4c6a-ae26-fb1e522765eb None None] Timed out waiting for
RPC response: timed out
2012-08-22 19:29:51 TRACE nova.rpc.common Traceback (most recent call last):
2012-08-22 19:29:51 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 490, in
ensure
2012-08-22 19:29:51 TRACE nova.rpc.common return method(*args, **kwargs)
2012-08-22 19:29:51 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 567, in
_consume
2012-08-22 19:29:51 TRACE nova.rpc.common return
self.connection.drain_events(timeout=timeout)
2012-08-22 19:29:51 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/connection.py", line 175, in
drain_events
2012-08-22 19:29:51 TRACE nova.rpc.common return
self.transport.drain_events(self.connection, **kwargs)
2012-08-22 19:29:51 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 238,
in drain_events
2012-08-22 19:29:51 TRACE nova.rpc.common return
connection.drain_events(**kwargs)
2012-08-22 19:29:51 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 57,
in drain_events
2012-08-22 19:29:51 TRACE nova.rpc.common return
self.wait_multi(self.channels.values(), timeout=timeout)
2012-08-22 19:29:51 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 63,
in wait_multi
2012-08-22 19:29:51 TRACE nova.rpc.common chanmap.keys(), allowed_methods,
timeout=timeout)
2012-08-22 19:29:51 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 120,
in _wait_multiple
2012-08-22 19:29:51 TRACE nova.rpc.common channel, method_sig, args,
content = read_timeout(timeout)
2012-08-22 19:29:51 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 94,
in read_timeout
2012-08-22 19:29:51 TRACE nova.rpc.common return
self.method_reader.read_method()
2012-08-22 19:29:51 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py",
line 221, in read_method
2012-08-22 19:29:51 TRACE nova.rpc.common raise m
2012-08-22 19:29:51 TRACE nova.rpc.common timeout: timed out
2012-08-22 19:29:51 TRACE nova.rpc.common
2012-08-22 19:29:52 CRITICAL nova [-] Timeout while waiting on RPC response.
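
One detail that can be read off the log above is how long the service waited between connecting to the AMQP server and giving up. A minimal sketch, assuming every nova log line starts with a 'YYYY-MM-DD HH:MM:SS' timestamp as shown; the helper names log_time and seconds_between are made up for illustration, and the request IDs are left out of the sample lines for brevity:

```python
from datetime import datetime

def log_time(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' timestamp of a nova log line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")

def seconds_between(first_line, second_line):
    """Return the number of seconds elapsed between two log lines."""
    return (log_time(second_line) - log_time(first_line)).total_seconds()

# Sample lines taken from the log above (request IDs omitted).
connected = "2012-08-22 19:28:51 INFO nova.rpc.common Connected to AMQP server on 172.18.32.7:5672"
timed_out = "2012-08-22 19:29:51 ERROR nova.rpc.common Timed out waiting for RPC response: timed out"

print(seconds_between(connected, timed_out))  # prints 60.0
```

So the connection itself succeeds, and the error appears exactly 60 seconds later while waiting for a reply.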

Only the controller's nova-compute service fails; the two compute nodes are fine.
This puzzles me.
Has RabbitMQ closed the connection? But the other nodes are fine.
So I looked at the RabbitMQ log:
=INFO REPORT==== 19-Aug-2012::15:51:33 ===
closing TCP connection <0.240.0> from 172.18.32.7:50592

=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.229.0> from 172.18.32.7:50587
connection_closed_abruptly

=INFO REPORT==== 24-Aug-2012::11:13:16 ===
closing TCP connection <0.229.0> from 172.18.32.7:50587

=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.262.0> from 172.18.32.7:50594
connection_closed_abruptly

=INFO REPORT==== 24-Aug-2012::11:13:16 ===
closing TCP connection <0.262.0> from 172.18.32.7:50594

=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.306.0> from 172.18.32.7:50600
connection_closed_abruptly

=INFO REPORT==== 24-Aug-2012::11:13:16 ===
closing TCP connection <0.306.0> from 172.18.32.7:50600

=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.330.0> from 172.18.32.7:50601
connection_closed_abruptly

=INFO REPORT==== 24-Aug-2012::11:13:16 ===
closing TCP connection <0.330.0> from 172.18.32.7:50601

=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.250.0> from 172.18.32.7:50593
connection_closed_abruptly

=INFO REPORT==== 24-Aug-2012::11:13:16 ===
closing TCP connection <0.250.0> from 172.18.32.7:50593

=INFO REPORT==== 24-Aug-2012::11:14:04 ===
Limiting to approx 924 file handles (829 sockets)

=INFO REPORT==== 24-Aug-2012::11:14:04 ===
Memory limit set to 1579MB of 3948MB total.

=INFO REPORT==== 24-Aug-2012::11:14:04 ===
msg_store_transient: using rabbit_msg_store_ets_index to provide index

=INFO REPORT==== 24-Aug-2012::11:14:04 ===
msg_store_persistent: using rabbit_msg_store_ets_index to provide index

=WARNING REPORT==== 24-Aug-2012::11:14:04 ===
msg_store_persistent: rebuilding indices from scratch
I cannot find the cause in this log.
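
To see at a glance which clients the broker reports as dropped, the RabbitMQ log above can be tallied per source address. A minimal sketch, assuming the log keeps the two-line WARNING body shown above ("exception on TCP connection ... from IP:PORT" followed by "connection_closed_abruptly"); the function name count_abrupt_closes is made up for illustration:

```python
import re
from collections import Counter

# Matches a WARNING entry body of the form shown in the log above:
#   exception on TCP connection <0.229.0> from 172.18.32.7:50587
#   connection_closed_abruptly
ABRUPT_CLOSE = re.compile(
    r"exception on TCP connection <[\d.]+> from ([\d.]+):\d+\s*\n"
    r"\s*connection_closed_abruptly"
)

def count_abrupt_closes(log_text):
    """Return a Counter mapping client IP -> number of abrupt closes."""
    return Counter(ABRUPT_CLOSE.findall(log_text))

# Two entries copied from the log above.
sample = (
    "=WARNING REPORT==== 24-Aug-2012::11:13:16 ===\n"
    "exception on TCP connection <0.229.0> from 172.18.32.7:50587\n"
    "connection_closed_abruptly\n"
    "\n"
    "=INFO REPORT==== 24-Aug-2012::11:13:16 ===\n"
    "closing TCP connection <0.229.0> from 172.18.32.7:50587\n"
    "\n"
    "=WARNING REPORT==== 24-Aug-2012::11:13:16 ===\n"
    "exception on TCP connection <0.262.0> from 172.18.32.7:50594\n"
    "connection_closed_abruptly\n"
)

print(count_abrupt_closes(sample))
```

In the excerpt above every abrupt close comes from 172.18.32.7, the controller itself, all at 11:13:16, shortly before the broker's startup messages at 11:14:04.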

Here is my config file (nova.conf):
[DEFAULT]
###### LOGS/STATE
#verbose=True
verbose=False

###### AUTHENTICATION
auth_strategy=keystone

###### SCHEDULER
#--compute_scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler
scheduler_driver=nova.scheduler.simple.SimpleScheduler

###### VOLUMES
volume_group=nova-volumes
volume_name_template=volume-%08x
iscsi_helper=tgtadm
iscsi_ip_prefix=172.18.32

###### DATABASE
sql_connection=mysql://nova:*****@172.18.32.7/nova

###### COMPUTE
libvirt_type=kvm
#libvirt_type=qemu
connection_type=libvirt
instance_name_template=instance-%08x
api_paste_config=/etc/nova/api-paste.ini
allow_resize_to_same_host=True
libvirt_use_virtio_for_bridges=true
start_guests_on_host_boot=true
resume_guests_state_on_host_boot=true

###### APIS
osapi_compute_extension=nova.api.openstack.compute.contrib.standard_extensions
allow_admin_api=true
s3_host=172.18.32.7
cc_host=172.18.32.7

###### RABBITMQ
rabbit_host=172.18.32.7

###### GLANCE
image_service=nova.image.glance.GlanceImageService
glance_api_servers=172.18.32.7:9292

###### NETWORK
network_manager=nova.network.manager.FlatDHCPManager
force_dhcp_release=True
dhcpbridge_flagfile=/etc/nova/nova.conf
dhcpbridge=/usr/bin/nova-dhcpbridge
firewall_driver=nova.virt.libvirt.firewall.IptablesFirewallDriver
public_interface=eth0
flat_interface=eth1
flat_network_bridge=br100
fixed_range=172.18.32.160/27
multi_host=true

###### NOVNC CONSOLE
novnc_enabled=true
novncproxy_base_url=http://172.18.32.7:6080/vnc_auto.html
vncserver_proxyclient_address=172.18.32.7
vncserver_listen=172.18.32.7

########Nova
logdir=/var/log/nova
state_path=/var/lib/nova
lock_path=/var/lock/nova

#####MISC
use_deprecated_auth=false
root_helper=sudo nova-rootwrap

When I stop the nova-compute service on the compute nodes and
reboot the system again, the controller node is fine!

So I suspect a RabbitMQ queue problem.
I then added the following lines to nova.conf:
rabbit_durable_queues=ture
rabbit_max_retries=3
rabbit_retry_backoff=5
rabbit_retry_interval=3

I restarted nova-compute,
but now the compute node fails with an error.
Why?
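
One thing that can be checked mechanically in the lines added above is whether each boolean option carries a value that parses as a boolean. A minimal sketch, assuming the accepted spellings are those of Python's standard configparser (whether nova's own option parser accepts exactly the same set is an assumption), and with the list of boolean option names supplied by the caller; find_bad_booleans is a made-up helper name:

```python
from configparser import RawConfigParser

def find_bad_booleans(conf_text, bool_keys):
    """Return (key, raw_value) pairs whose value does not parse as a boolean."""
    # RawConfigParser does no '%' interpolation, so values such as
    # 'instance-%08x' elsewhere in nova.conf would not trip it up.
    parser = RawConfigParser()
    parser.read_string(conf_text)
    bad = []
    for key in bool_keys:
        if not parser.has_option("DEFAULT", key):
            continue  # option not set, nothing to check
        try:
            parser.getboolean("DEFAULT", key)
        except ValueError:
            bad.append((key, parser.get("DEFAULT", key)))
    return bad

# The four lines added to nova.conf, exactly as written above.
added = """[DEFAULT]
rabbit_durable_queues=ture
rabbit_max_retries=3
rabbit_retry_backoff=5
rabbit_retry_interval=3
"""

print(find_bad_booleans(added, ["rabbit_durable_queues"]))
# prints [('rabbit_durable_queues', 'ture')]
```

This only flags a value that does not parse; whether it is related to the error on the compute node is not established here.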

Regards,
WangPeng