[openstack-dev] [nova] How to deal the rpc timeout between compute and conductor?
chenrui.momo at gmail.com
Mon Mar 23 10:00:43 UTC 2015
I deployed OpenStack with the VMware driver. One nova-compute connects to
the VMware deployment, which has about 3000 VMs. I use MySQL. An RPC
timeout error is raised when ComputeManager.init_host() and the
_sync_power_states periodic task execute.
Currently, in the nova VMware driver, one nova-compute host maps to the
whole VMware deployment, which may contain several clusters. When
InstanceList.get_by_host executes in ComputeManager, nova-compute makes
an RPC call to nova-conductor, and nova-conductor fetches all the
instances in the whole VMware deployment at once; in my case, that is
3000 instances. The long-running SQL query may lead to the RPC timeout
from nova-compute to nova-conductor. We only face this issue with VMware.
In the patch, I split the large RPC request into multiple small RPC
requests using a pagination mechanism in order to fix this issue, but
sahid thinks it looks like a hack and that a real pattern is needed to
handle this problem.
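
To make the pagination idea concrete, here is a minimal sketch. It is not
the code from the patch and does not use real nova APIs: FAKE_DB,
fetch_page, and iter_instances are hypothetical names, and the in-memory
list only stands in for the database behind nova-conductor. It assumes
limit/marker style paging, where each small request resumes after the last
instance of the previous page.

```python
from typing import Callable, Iterator, List, Optional

# Hypothetical stand-in for the instance rows nova-conductor would read.
FAKE_DB = [{"uuid": "vm-%04d" % i, "host": "compute-1"} for i in range(3000)]


def fetch_page(host: str, limit: int, marker: Optional[str]) -> List[dict]:
    """Hypothetical stand-in for one small RPC call to nova-conductor.

    Returns at most `limit` instances on `host`, starting after `marker`.
    """
    rows = [r for r in FAKE_DB if r["host"] == host]
    start = 0
    if marker is not None:
        # Resume right after the last instance of the previous page.
        start = next(i for i, r in enumerate(rows) if r["uuid"] == marker) + 1
    return rows[start:start + limit]


def iter_instances(
    host: str,
    page_size: int = 500,
    fetch: Callable[[str, int, Optional[str]], List[dict]] = fetch_page,
) -> Iterator[dict]:
    """Yield every instance on `host` using many small requests."""
    marker = None
    while True:
        page = fetch(host, page_size, marker)
        if not page:
            return  # an empty page means there is nothing left
        yield from page
        if len(page) < page_size:
            return  # a short page is the last one
        marker = page[-1]["uuid"]


instances = list(iter_instances("compute-1", page_size=500))
```

Each request now returns at most page_size rows, so a single slow query is
less likely to exceed the RPC timeout, at the cost of more round trips.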
If you have a better idea, please let me know.
Feel free to discuss it. Thanks.