[Openstack] Re: Re: Ceilometer high availability in active-active
Pan, Fengyun
panfy.fnst at cn.fujitsu.com
Wed Mar 18 03:16:07 UTC 2015
Thank you!
I have set the backend_url on both the compute node and the controller node as follows:
backend_url=redis://193.168.196.246:6379
The IP address of my compute node is "193.168.196.246".
Redis is installed on the compute node:
# rpm -qa | grep redis
redis-2.8.15-2.el7ost.x86_64
python-redis-2.10.3-1.el7ost.noarch
So when the ceilometer-agent-central service runs on the compute node, it connects to the redis service successfully.
But when the ceilometer-agent-central service runs on the controller node, we get the following log:
______________________________
2015-03-18 18:48:05.948 16236 ERROR ceilometer.coordination [-] Error connecting to coordination backend.
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination Traceback (most recent call last):
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination File "/usr/lib/python2.7/site-packages/ceilometer/coordination.py", line 70, in start
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination self._coordinator.start()
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 182, in start
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination self._start()
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 354, in _start
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination self._server_info = self._client.info()
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination self.gen.throw(type, value, traceback)
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 78, in _translate_failures
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination raise coordination.ToozConnectionError(utils.exception_message(e))
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination ToozConnectionError: Error 113 connecting to 193.168.196.246:6379. EHOSTUNREACH.
2015-03-18 18:48:05.948 16236 TRACE ceilometer.coordination
2015-03-18 18:48:36.953 16236 ERROR ceilometer.coordination [-] Error sending a heartbeat to coordination backend.
2015-03-18 18:48:36.953 16236 TRACE ceilometer.coordination Traceback (most recent call last):
2015-03-18 18:48:36.953 16236 TRACE ceilometer.coordination File "/usr/lib/python2.7/site-packages/ceilometer/coordination.py", line 86, in heartbeat
2015-03-18 18:48:36.953 16236 TRACE ceilometer.coordination self._coordinator.heartbeat()
2015-03-18 18:48:36.953 16236 TRACE ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 408, in heartbeat
2015-03-18 18:48:36.953 16236 TRACE ceilometer.coordination value=b"Not dead!")
2015-03-18 18:48:36.953 16236 TRACE ceilometer.coordination File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
2015-03-18 18:48:36.953 16236 TRACE ceilometer.coordination self.gen.throw(type, value, traceback)
2015-03-18 18:48:36.953 16236 TRACE ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 78, in _translate_failures
2015-03-18 18:48:36.953 16236 TRACE ceilometer.coordination raise coordination.ToozConnectionError(utils.exception_message(e))
2015-03-18 18:48:36.953 16236 TRACE ceilometer.coordination ToozConnectionError: Error connecting to 193.168.196.246:6379. timed out.
2015-03-18 18:48:36.953 16236 TRACE ceilometer.coordination
_____________________________
Why does it time out?
Is there a problem with my configuration?
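
For reference, error 113 in the first traceback is the Linux errno EHOSTUNREACH ("No route to host"), which is raised at the TCP level before redis itself is involved. A minimal sketch for checking whether the controller node can open a TCP connection to the redis port follows. It is written for Python 3 (the traceback above comes from Python 2.7), the helper name check_tcp is hypothetical, and it only tests reachability. One possible cause is a firewall rule or routing problem between the two nodes, but that is an assumption the log does not confirm.

```python
import errno
import socket


def check_tcp(host, port, timeout=3.0):
    """Try a plain TCP connection and return a short status string."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        # No answer at all within the timeout.
        return "timed out"
    except OSError as exc:
        # Map the errno to its symbolic name, e.g. 113 -> EHOSTUNREACH on Linux.
        return errno.errorcode.get(exc.errno, "error: %s" % exc)
    sock.close()
    return "ok"
```

Running check_tcp("193.168.196.246", 6379) on the controller node and on the compute node should show whether the two nodes see the port differently.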
-----Original Message-----
From: Chris Dent [mailto:chdent at redhat.com]
Sent: March 11, 2015 21:08
To: Pan, Fengyun/潘 风云
Cc: Vijaya Bhaskar; openstack
Subject: Re: Re: [Openstack] Ceilometer high availability in active-active
On Wed, 11 Mar 2015, Pan, Fengyun wrote:
> We know that:
> backend_url',
> default=None,
> help='The backend URL to use for distributed coordination. If '
> 'left empty, per-deployment central agent and per-host '
> 'compute agent won\'t do workload '
> 'partitioning and will only function correctly if a '
> 'single instance of that service is running.'),
> But how do we set the 'backend_url'?
This appears to be an oversight in the documentation. The main starting point is here:
http://docs.openstack.org/admin-guide-cloud/content/section_telemetry-cetral-compute-agent-ha.html
but neither that page nor anything it links to says what the value of the setting should be. It depends entirely on which backend is used and how that backend is configured. Each of the tooz drivers has some information on some of the options, but again, it is not fully documented yet.
For reference, what I use in my own testing is redis as follows:
redis://localhost:6379
This uses a single redis server, so it introduces another single point of failure. It's possible to use sentinel to improve upon this situation:
http://docs.openstack.org/developer/tooz/developers.html#redis
The other drivers work in similar ways with their own unique arguments.
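
To make the shape of these values concrete, here is a small sketch that splits a backend URL into the parts a driver would care about: the scheme (which names the driver), the host and port, and any query options. It uses only the Python 3 standard library and does not call tooz. The function name describe_backend_url is hypothetical, and the sentinel example in the usage note assumes the query-parameter form described on the tooz page linked above, with "mymaster" as a placeholder master name.

```python
from urllib.parse import parse_qs, urlsplit


def describe_backend_url(url):
    """Split a coordination backend URL into its parts."""
    parts = urlsplit(url)
    return {
        "driver": parts.scheme,             # e.g. "redis"
        "host": parts.hostname,
        "port": parts.port,
        "options": parse_qs(parts.query),   # driver-specific arguments, if any
    }
```

For example, describe_backend_url("redis://localhost:6379") reports the redis driver on localhost port 6379 with no options, while a URL such as "redis://localhost:26379?sentinel=mymaster" would carry the sentinel name in the options.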
I'm sorry I'm not able to point to more complete information but I can say that it is in the process of being improved.
--
Chris Dent tw:@anticdent freenode:cdent
https://tank.peermore.com/tanks/cdent