Open Stack

Mon Mar 31 09:00:02 UTC 2014

In my OpenStack cluster (Havana), some instances occasionally cannot be
pinged after I reboot many instances simultaneously.
If I run "nova console-log ...", I see this line:
"cloud-init-nonet waiting 120 seconds for a network device."
To fix this, I must run "service neutron-plugin-openvswitch-agent restart"
on the compute nodes that host these instances.
After restarting the neutron agent, I can ping these instances normally.

Therefore, I built another OpenStack cluster (Havana) to test, but the
problem persists. I tested with both VLAN and GRE, and with both soft and
hard reboots, and the problem is the same:
# for server_id in `nova list | grep ACTIVE | awk '{print $2}'`; do echo $server_id; nova reboot --hard $server_id; done
or
# for server_id in `nova list | grep ACTIVE | awk '{print $2}'`; do echo $server_id; nova reboot $server_id; done

Check console-log:
# for server_id in `nova list | grep ACTIVE | awk '{print $2}'`; do echo $server_id; nova console-log $server_id | grep "waiting 120 seconds"; done

cea03292-5c76-45dd-af35-b9d80b82ac5d
37b02ca8-98fc-40c7-a700-f7942d134b65
a92051b1-3a1d-4f68-ba77-4d093e916ce4
cloud-init-nonet waiting 120 seconds for a network device.
def8f657-572a-4431-902a-c983d0aad7a9
cloud-init-nonet waiting 120 seconds for a network device.

And I cannot ping these 2 instances: a92051b1-3a1d-4f68-ba77-4d093e916ce4 and
def8f657-572a-4431-902a-c983d0aad7a9
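
To find out which compute nodes need the agent restart, the console-log check can be combined with a host lookup. This is only a minimal sketch: it assumes admin credentials are loaded, so that "nova show" prints the OS-EXT-SRV-ATTR:host field, and the function names active_ids and affected_instances are made up for illustration.

```shell
#!/bin/sh
# Sketch: list ACTIVE instances whose console log shows the cloud-init
# network timeout, together with the compute host each one runs on.
# Assumption: admin credentials are sourced, so "nova show" includes
# the OS-EXT-SRV-ATTR:host row.

MSG="waiting 120 seconds for a network device"

# Print the IDs of all ACTIVE instances (second column of "nova list").
active_ids() {
    nova list | awk '/ACTIVE/ {print $2}'
}

# Print "<instance-id> <compute-host>" for every affected instance.
affected_instances() {
    for server_id in $(active_ids); do
        if nova console-log "$server_id" | grep -q "$MSG"; then
            host=$(nova show "$server_id" |
                awk '$2 == "OS-EXT-SRV-ATTR:host" {print $4}')
            echo "$server_id $host"
        fi
    done
}
```

Running "affected_instances | awk '{print $2}' | sort -u" would then give the distinct compute nodes on which "service neutron-plugin-openvswitch-agent restart" has to be run. This only narrows down where to apply the workaround; it does not fix the underlying problem.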

This bug has existed in the neutron agent for quite a long time, but it still has not been fixed.

Can you suggest a quick fix for this?
Thank you very much!

Best Regards,

[Openstack] [Neutron-agent] Some instances can not ping on mass reboot
