[openstack-dev] [oslo] rabbitmq timeout

Ajay Kalambur (akalambu) akalambu at cisco.com
Tue Nov 15 19:21:49 UTC 2016


Hi,

We see an issue when we boot 10 instances at a time on a single compute node: sometimes a few of the instances error out on the compute node with the traceback below.

It looks like the RPC call from compute that sets the instance to Active is timing out. Setting the RPC response timeout to 180 seconds seems to help. However, a TCP dump shows that the message reaches the conductor and the conductor replies almost instantly, so it looks like the compute node in question somehow takes a long time to process the reply. We also had to increase the heartbeat timeout from 60 to 120 to make this scenario work better. Any pointers or inputs on what may be going on here?
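
For reference, the two settings mentioned above can be written down as a nova.conf fragment. The sketch below is only an illustration: it assumes the usual oslo.messaging option names (rpc_response_timeout under [DEFAULT], heartbeat_timeout_threshold under [oslo_messaging_rabbit]) and a default of 60 for both, and the helper read_timeouts is a hypothetical name, not part of nova or oslo.

```python
import configparser

# nova.conf fragment with the values described in this mail.
# Option names and sections are assumed from oslo.messaging conventions.
NOVA_CONF = """
[DEFAULT]
rpc_response_timeout = 180

[oslo_messaging_rabbit]
heartbeat_timeout_threshold = 120
"""


def read_timeouts(conf_text):
    """Return (rpc_response_timeout, heartbeat_timeout_threshold) in seconds.

    Falls back to 60 for either value when it is not set (assumed default).
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    rpc_timeout = parser.getint(
        "DEFAULT", "rpc_response_timeout", fallback=60)
    heartbeat = parser.getint(
        "oslo_messaging_rabbit", "heartbeat_timeout_threshold", fallback=60)
    return rpc_timeout, heartbeat


if __name__ == "__main__":
    print(read_timeouts(NOVA_CONF))
```

Running the same helper against the nova.conf of the affected compute node would be a quick way to confirm which values are really in effect there.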




2016-11-14 04:44:19.518 6 INFO nova.virt.libvirt.driver [-] [instance: 31e67591-6bd4-4df9-bba1-7a1d6f785be0] Instance spawned successfully.

2016-11-14 04:44:55.692 6 INFO nova.compute.manager [req-25dc826e-54bc-4c7c-b58f-c32c8e024624 - - - - -] [instance: 31e67591-6bd4-4df9-bba1-7a1d6f785be0] During sync_power_state the instance has a pending task (spawning). Skip.

2016-11-14 04:44:55.693 6 INFO nova.compute.manager [req-25dc826e-54bc-4c7c-b58f-c32c8e024624 - - - - -] [instance: 31e67591-6bd4-4df9-bba1-7a1d6f785be0] VM Started (Lifecycle Event)

2016-11-14 04:45:04.544 6 INFO nova.virt.libvirt.driver [-] [instance: 5040fdad-2dbe-49a2-aa3d-ccd8e7725b5d] Instance spawned successfully.

2016-11-14 04:45:04.558 6 INFO nova.virt.libvirt.driver [-] [instance: ba4b3c89-0f57-4c46-91dd-44434012d289] Instance spawned successfully.

2016-11-14 04:45:04.607 6 INFO nova.compute.manager [req-25dc826e-54bc-4c7c-b58f-c32c8e024624 - - - - -] [instance: 5040fdad-2dbe-49a2-aa3d-ccd8e7725b5d] VM Resumed (Lifecycle Event)

2016-11-14 04:45:16.669 6 INFO nova.virt.libvirt.driver [-] [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c] Instance spawned successfully.

2016-11-14 04:45:16.710 6 INFO nova.compute.manager [req-25dc826e-54bc-4c7c-b58f-c32c8e024624 - - - - -] [instance: 5040fdad-2dbe-49a2-aa3d-ccd8e7725b5d] VM Started (Lifecycle Event)

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [req-20ec8e07-d57c-4679-8122-4cfa990b6d6f 168b323822284084b1c1393faeb5b9e1 aa39823c34e4496793250c0bc5cf7a31 - - -] [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c] Unexpected build failure, not rescheduling build.

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c] Traceback (most recent call last):

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]     filter_properties)

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2082, in _build_and_run_instance

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]     instance.save(expected_task_state=task_states.SPAWNING)

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 197, in wrapper

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]     ctxt, self, fn.__name__, args, kwargs)

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]   File "/usr/lib/python2.7/site-packages/nova/conductor/rpcapi.py", line 242, in object_action

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]     objmethod=objmethod, args=args, kwargs=kwargs)

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 158, in call

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]     retry=self.retry)

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 90, in _send

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]     timeout=timeout, retry=retry)

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 431, in send

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]     retry=retry)

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 420, in _send

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]     result = self._waiter.wait(msg_id, timeout)

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 318, in wait

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]     message = self.waiters.get(msg_id, timeout=timeout)

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 223, in get

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]     'to message ID %s' % msg_id)

2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c] MessagingTimeout: Timed out waiting for a reply to message ID 7cb74c7df8e148e9bd3d3cee2e433b70


2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c]
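
To put a number on the delay, one rough approach is to measure, per instance, the time between the "Instance spawned successfully" line and the MessagingTimeout line. The following is a minimal sketch, assuming the log line layout shown above; the regular expression and the function name spawn_to_timeout_seconds are illustrative and not part of nova.

```python
import re
from datetime import datetime

# Matches: "<timestamp> <pid> <LEVEL> <logger> ... [instance: <uuid>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \d+ "
    r"[A-Z]+ \S+ .*?\[instance: (?P<uuid>[0-9a-f-]{36})\] (?P<msg>.*)$"
)

# Two lines copied from the log above.
SAMPLE_LOG = """\
2016-11-14 04:45:16.669 6 INFO nova.virt.libvirt.driver [-] [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c] Instance spawned successfully.
2016-11-14 04:46:43.765 6 ERROR nova.compute.manager [instance: 3794b761-f623-4fb9-a91a-f2dffca42c4c] MessagingTimeout: Timed out waiting for a reply to message ID 7cb74c7df8e148e9bd3d3cee2e433b70
"""


def spawn_to_timeout_seconds(log_text):
    """Map instance UUID -> seconds between spawn and MessagingTimeout."""
    spawned = {}
    gaps = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # blank lines and lines without an instance message
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        uuid = match.group("uuid")
        msg = match.group("msg")
        if "Instance spawned successfully" in msg:
            spawned[uuid] = ts
        elif "MessagingTimeout" in msg and uuid in spawned:
            gaps[uuid] = (ts - spawned[uuid]).total_seconds()
    return gaps


if __name__ == "__main__":
    for uuid, seconds in spawn_to_timeout_seconds(SAMPLE_LOG).items():
        print(uuid, seconds)
```

For the failed instance in this log the gap comes out at roughly 87 seconds.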




Ajay
