[Openstack] Restore nova instances after Ceph crash

John Favorite john.favorite at gmail.com
Wed Jun 15 17:06:14 UTC 2016


Awesome. Thx for sharing!

On Wed, Jun 15, 2016, 11:26 AM Eugen Block <eblock at nde.ag> wrote:

> Hi,
>
> this is not a question for you guys, it's more of a report of what I
> was facing today, and I wanted to share it in case any of you run
> into the same or a similar issue.
>
> Due to an incautious upload of several huge images to glance (backed
> by Ceph), my colleague crashed our Ceph cluster: the OSDs were full
> or near full, so the glance images got stuck in the status "saving".
> The more severe impact was that the already running instances with
> their root disks in Ceph were not reachable anymore, and nova-compute
> also shut down.
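>
> For reference, the cluster state and the full OSDs can be checked
> with the standard Ceph commands; the output differs a bit between
> releases, but roughly:
>
>    ceph -s              # overall cluster state, full/nearfull warnings
>    ceph health detail   # names the affected OSDs explicitly
>    ceph osd df          # per-OSD utilisation
>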
> So we had to get Ceph to a healthy state first: we added some OSDs
> and the cluster recovered successfully. Then I wanted to boot the
> instances again, which basically worked, except that the instances
> had no network interface. Horizon still showed the IP address
> information in the dashboard, so I assumed the information still had
> to be somewhere, but virsh dumpxml didn't contain any interface data
> either. A reboot of my control and compute nodes didn't fix anything,
> although new instances were created successfully, even in the very
> network the troublesome instances were supposed to boot in.
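>
> This is roughly how I compared the two views (instance and domain
> names below are just placeholders):
>
>    # what nova thinks is attached to the instance
>    nova interface-list <INSTANCE>
>    # on the compute node: what libvirt actually defined for the
>    # domain (<DOMAIN> is the libvirt name, e.g. instance-000000xx)
>    virsh dumpxml <DOMAIN> | grep -A 6 "<interface"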
>
> I took a look at the port list; the ports were still there. So I
> tried detaching and reattaching the instance's interface. Detaching
> seemed to work, but a refresh of the page or a "nova interface-list
> <INSTANCE>" still showed the same IP address. As a workaround I
> temporarily attached a new interface in the same subnet, detached it
> again, and then I saw what I wanted to see: NO IP address. Then I
> reattached the original port via the nova CLI (Horizon doesn't let
> you choose a specific port, at least I haven't found it yet), and my
> instances had their interfaces back. I was also facing the issue that
> some ports seemed to be available, but when I wanted to attach them,
> I got an error message saying the port didn't exist. So I created new
> ports with the same IP addresses as the original ports, attached them,
> and now I have a working cloud and a healthy cluster again. The whole
> sequence is sketched below.
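>
> Roughly, the sequence looked like this (IDs and names are
> placeholders, and the exact syntax may differ with your nova/neutron
> client versions):
>
>    # check that the original ports still exist in neutron
>    neutron port-list | grep <FIXED_IP>
>    # detach the stale interface from the instance
>    nova interface-detach <INSTANCE> <ORIGINAL_PORT_ID>
>    # attach and detach a temporary interface in the same network
>    # until the stale IP really disappears from interface-list
>    nova interface-attach --net-id <NET_ID> <INSTANCE>
>    nova interface-detach <INSTANCE> <TEMP_PORT_ID>
>    # reattach the original port explicitly by its port ID
>    nova interface-attach --port-id <ORIGINAL_PORT_ID> <INSTANCE>
>    # if the old port turns out to be gone, recreate it with the
>    # same fixed IP and attach the new port instead
>    neutron port-create <NETWORK> --fixed-ip ip_address=<FIXED_IP>
>    nova interface-attach --port-id <NEW_PORT_ID> <INSTANCE>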
>
> It took me a couple of hours to figure that out; I hope this helps
> someone! I think it's better than rebuilding your instances... ;-)
>
> Best regards,
> Eugen
>
> --
> Eugen Block                             voice   : +49-40-559 51 75
> NDE Netzdesign und -entwicklung AG      fax     : +49-40-559 51 77
> Postfach 61 03 15
> D-22423 Hamburg                         e-mail  : eblock at nde.ag
>
>          Vorsitzende des Aufsichtsrates: Angelika Mozdzen
>            Sitz und Registergericht: Hamburg, HRB 90934
>                    Vorstand: Jens-U. Mozdzen
>                     USt-IdNr. DE 814 013 983