[Openstack-operators] Neutron crashed hard

Erik McCormick emccormick at cirrusseven.com
Thu Dec 19 04:37:57 UTC 2013


It sounds to me more like your database went AWOL than like a Neutron
problem. Assuming you had done a bit of mucking around testing the cluster
before this event, is there any chance you're not using memcached and your
Keystone token table has grown large? You might want to switch Keystone's
token backend over to memcached and see if that makes it happier.
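
To check quickly which token backend a controller is using, something like
the sketch below could work. It only parses the text of keystone.conf. The
[token] driver option and the backend class paths in the sample are what I
remember from the Havana sample config, so treat them as assumptions and
compare against your own file. The token_backend helper and the SAMPLE text
are made up for illustration.

```python
import configparser


def token_backend(conf_text):
    """Classify the Keystone token backend named in a keystone.conf body.

    Returns 'memcache', 'sql', 'other', or 'unset'. The option name
    ([token] driver) and the class paths are assumptions based on the
    Havana-era sample config; verify them against your own file.
    """
    # strict=False tolerates duplicated sections/options; interpolation is
    # disabled so '%' characters in values are read literally.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)

    driver = parser.get("token", "driver", fallback=None)
    if driver is None:
        return "unset"  # option absent: Keystone uses its built-in default
    if ".memcache." in driver:
        return "memcache"
    if ".sql." in driver:
        return "sql"
    return "other"


# Hypothetical config fragment, for illustration only.
SAMPLE = """
[token]
driver = keystone.token.backends.memcache.Token

[memcache]
servers = localhost:11211
"""

print(token_backend(SAMPLE))
```

If this reports "sql" (or "unset", depending on your release's default),
count the rows in the token table of the keystone database before changing
anything. That tells you whether table growth is really the issue.
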
On Dec 18, 2013 9:40 PM, "Joe Topjian" <joe at topjian.net> wrote:

> Hello,
>
> I set up an internal OpenStack cloud to give a workshop for around 15
> people. I decided to use Neutron as I'm trying to get more experience with
> it. The cloud consisted of a cloud controller and four compute nodes. Very
> decent Dell hardware, Ubuntu 12.04, Havana 2013.2.0.
>
> Neutron was configured with the OVS plugin, non-overlapping IPs, and a
> single shared subnet. GRE tunnelling was used between compute nodes.
>
> Everything was working fine until the 15 people tried launching a CirrOS
> instance at approximately the same time.
>
> Then Neutron crashed.
>
> The compute nodes had this in their logs:
>
> 2013-12-18 09:52:57.707 28514 TRACE nova.compute.manager ConnectionFailed:
> Connection to neutron failed: timed out
>
> All instances went into an Error state.
>
> Restarting the Neutron services did no good. Terminating the Error'd
> instances seemed to make the problem worse -- the entire cloud became
> unavailable (meaning, both Horizon and Nova were unusable as they would
> time out waiting for Neutron).
>
> We moved on to a different cloud to continue on with the workshop. I would
> occasionally issue "neutron net-list" in the original cloud to see if I
> would get a result. It took about an hour.
>
> What happened?
>
> I've read about Neutron performance issues -- would this be something
> along those lines?
>
> What's the best way to quickly recover from a situation like this?
>
> Since then, I haven't recreated the database, networks, or anything like
> that. Is there a specific log or database table I can look for to see more
> information on how exactly this situation happened?
>
> Thanks,
> Joe
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>