[Openstack] Restore nova instances after Ceph crash

Eugen Block eblock at nde.ag
Wed Jun 15 15:14:04 UTC 2016


Hi,

this is not a question, just a report of what I was facing today. I
wanted to share it in case any of you run into the same or a similar
issue.

Due to a careless upload of several huge images to glance (backed
by Ceph), my colleague crashed our Ceph cluster: the OSDs were full or
near full, so the glance images got stuck in status "saving". The
more severe impact was that the already running instances with their
root disk in Ceph were no longer reachable, and nova-compute also shut
down.
So we had to get Ceph back to a healthy state first. We added some
OSDs and the cluster recovered successfully. Then I tried to boot the
instances again, and basically that worked, except that the instances
had no network interface. Horizon still showed the IP address
information in the dashboard, so I assumed the information still had
to be stored somewhere, but virsh dumpxml didn't contain any interface
data either. A reboot of my control and compute nodes didn't fix
anything, although new instances were created successfully, even in
the same networks the troublesome instances were supposed to boot in.
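
For anyone who wants to check for the same symptom, here is a minimal
sketch of the "no interface in virsh dumpxml" test. The helper name
count_interfaces and the domain name instance-0000002a are made up for
illustration; the ceph commands in the comments are just the usual
health checks, not a record of what was actually run.

```shell
#!/bin/sh
# Print the number of lines in a libvirt domain XML (read from stdin)
# that open an <interface ...> element. 0 means the guest definition
# has no NIC at all, which is the symptom described above.
count_interfaces() {
    # grep -c exits non-zero when nothing matches, so mask that status.
    grep -c '<interface ' || true
}

# Assumed usage on the compute node (domain name is a placeholder):
#   virsh dumpxml instance-0000002a | count_interfaces
#
# Assumed checks that Ceph is healthy again before booting instances:
#   ceph health detail
#   ceph osd df
```

A result of 0 for an instance that Horizon still lists with an IP
address matches the situation described here.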

I took a look at the port list, and the ports were still there. So I
tried detaching and reattaching the instance's interface. Detaching
seemed to work, but a refresh of the page or a "nova interface-list
<INSTANCE>" still showed the same IP address. As a workaround I
temporarily attached a new interface in the same subnet and detached
it again, and then I saw what I wanted to see: NO IP address. Then I
reattached the original port via the nova CLI (Horizon doesn't let you
choose a specific port, at least I haven't found it yet), and my
instances had their interfaces back. I also ran into some ports that
seemed to be available, but when I tried to attach them I got an error
message saying that the port didn't exist. So I created new ports with
the same IP addresses as the original ports and attached them, and now
I have a working cloud and a healthy cluster again.
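
The steps above can be sketched roughly as follows with the nova and
neutron CLIs. This is a sketch under assumptions, not a transcript of
the commands that were run: the function names and the DRY_RUN switch
are invented for illustration, and all server, port, network and
subnet IDs are placeholders. Note that "nova interface-attach" takes a
network ID (--net-id), not a subnet ID. By default the commands are
only printed, so nothing is changed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to really execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# detach_port SERVER PORT_ID: detach a (stale or temporary) port.
detach_port() {
    run nova interface-detach "$1" "$2"
}

# attach_temp SERVER NET_ID: attach a temporary interface; nova
# creates a new port in the given network.
attach_temp() {
    run nova interface-attach --net-id "$2" "$1"
}

# attach_port SERVER PORT_ID: reattach a specific existing port.
attach_port() {
    run nova interface-attach --port-id "$2" "$1"
}

# create_port NET_ID SUBNET_ID IP: recreate a port with the original
# IP address when the old port cannot be attached anymore.
create_port() {
    run neutron port-create \
        --fixed-ip "subnet_id=$2,ip_address=$3" "$1"
}

# Assumed order of operations for one instance:
#   detach_port  SERVER ORIGINAL_PORT_ID
#   attach_temp  SERVER NET_ID
#   nova interface-list SERVER      # look up the temporary port ID
#   detach_port  SERVER TEMP_PORT_ID
#   attach_port  SERVER ORIGINAL_PORT_ID
```

If attaching the original port fails with "port does not exist",
create_port followed by attach_port with the new port ID corresponds
to the last step described above.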

It took me a couple of hours to figure that out, so I hope this helps
someone! I think it's better than rebuilding your instances... ;-)

Best regards,
Eugen

-- 
Eugen Block                             voice   : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG      fax     : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg                         e-mail  : eblock at nde.ag

         Vorsitzende des Aufsichtsrates: Angelika Mozdzen
           Sitz und Registergericht: Hamburg, HRB 90934
                   Vorstand: Jens-U. Mozdzen
                    USt-IdNr. DE 814 013 983




