[Openstack-operators] [Openstack] Recovering from full outage

Torin Woltjer torin.woltjer at granddial.com
Thu Jul 5 12:43:43 UTC 2018


There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers?
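
For anyone following the thread, here is a minimal sketch of how one might check whether lock_path is set on a given node. It assumes the option lives in the [oslo_concurrency] section of the neutron configuration file, which is where the upstream install guides place it. The function name get_lock_path is made up for illustration and is not part of any OpenStack tool.

```shell
#!/bin/sh
# Hypothetical helper: print the lock_path value found in the
# [oslo_concurrency] section of an INI-style config file.
# Returns non-zero if no lock_path is set in that section.
get_lock_path() {
    conf_file="$1"
    awk '
        # Track which section we are in; only [oslo_concurrency] counts.
        /^[ \t]*\[/ {
            in_section = ($0 ~ /^[ \t]*\[oslo_concurrency\][ \t]*$/)
            next
        }
        # Commented-out lines (starting with #) do not match this pattern.
        in_section && /^[ \t]*lock_path[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")   # drop the key and the equals sign
            sub(/[ \t]+$/, "")         # drop trailing whitespace
            print
            found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$conf_file"
}
```

Running something like `get_lock_path /etc/neutron/neutron.conf || echo "lock_path not set"` on each controller and compute node (the path is assumed here, adjust it to the deployment) would show whether the option is present and whether the values match across nodes.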

Torin Woltjer
 
Grand Dial Communications - A ZK Tech Inc. Company
 
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: George Mihaiescu <lmihaiescu at gmail.com>
Sent: 7/3/18 7:47 PM
To: torin.woltjer at granddial.com
Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage

Did you set a lock_path in neutron's config?

On Jul 3, 2018, at 17:34, Torin Woltjer <torin.woltjer at granddial.com> wrote:

The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/

No such errors appear on the compute nodes themselves.

----------------------------------------
From: "Torin Woltjer" <torin.woltjer at granddial.com>
Sent: 7/3/18 5:14 PM
To: <lmihaiescu at gmail.com>
Cc: "openstack-operators at lists.openstack.org" <openstack-operators at lists.openstack.org>, "openstack at lists.openstack.org" <openstack at lists.openstack.org>
Subject: Re: [Openstack] Recovering from full outage
Running `openstack server reboot` on an instance just leaves the instance stuck in a rebooting status. The most notable log is neutron-server.log, which shows the following:
http://paste.openstack.org/show/724917/

I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online.
http://paste.openstack.org/show/724921/
All of the instances can now be started properly; however, I cannot ping any of the instances' floating IPs or the neutron router. When I log into an instance through the console, there is no IP address on any interface.

----------------------------------------
From: George Mihaiescu <lmihaiescu at gmail.com>
Sent: 7/3/18 11:50 AM
To: torin.woltjer at granddial.com
Subject: Re: [Openstack] Recovering from full outage
Try restarting them using "openstack server reboot" and also check the nova-compute.log and the neutron agent logs on the compute nodes.

On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer <torin.woltjer at granddial.com> wrote:
We just suffered a power outage in our data center and I'm having trouble recovering the OpenStack cluster. All of the nodes are back online and every instance shows active, but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present, and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks?
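
The checks described above could be collected into a small script to run on each node. This is only a sketch: the function names count_shut_off and report_node_state are invented for illustration, and it assumes `virsh` and an iproute2 `ip` that accepts `type bridge` are installed where applicable.

```shell
#!/bin/sh
# Hypothetical helper: count domains that libvirt reports as "shut off".
# Reads the output of `virsh list --all` on stdin.
count_shut_off() {
    awk '/shut off[ \t]*$/ { n++ } END { print n + 0 }'
}

# Hypothetical helper: summarize the symptoms from the message above.
# Each check is skipped if the command it needs is not installed.
report_node_state() {
    if command -v virsh >/dev/null 2>&1; then
        echo "shut-off domains: $(virsh list --all | count_shut_off)"
    fi
    if command -v ip >/dev/null 2>&1; then
        # One line per bridge device and per network namespace.
        echo "bridge devices: $(ip -o link show type bridge | wc -l)"
        echo "network namespaces: $(ip netns list | wc -l)"
    fi
}
```

Calling `report_node_state` as root on a compute or controller node prints one line per check. After a recovery, the bridge and namespace counts would be expected to be non-zero on nodes that host networking, though the exact numbers depend on the deployment.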
