[Openstack-operators] Can't reboot instance because the source Image was deleted - Grizzly - Ubuntu 12.04

George Shuklin george.shuklin at gmail.com
Tue May 27 22:07:24 UTC 2014


Em... I think it's not a bug. Maybe some kind of inconvenience, and 
better error logging is always welcome, but the error message is 
correct: the image is not found. Not in glance, but inside nova's 
internals.

For everyone who knows how nova works it's perfectly clear. For new 
users it is unclear, but that makes it a 'wishlist' item, not a bug.

P. S. As far as I've heard, Nova is really smart and will attempt to 
re-download the image from glance if the _base part is deleted.
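
A quick way to see that dependency (the paths are the stock 
/var/lib/nova layout; the instance UUID is a placeholder):

qemu-img info /var/lib/nova/instances/<instance-uuid>/disk
# the "backing file:" line names the file under
# /var/lib/nova/instances/_base/ that has to exist (or be
# restorable from glance) for the instance to boot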


On 05/28/2014 12:25 AM, Juan José Pavlik Salles wrote:
> By the way... there's something we can learn from this 
> misunderstanding. The error message:
>
> 2014-05-27 15:23:45.002 ERROR nova.compute.manager 
> [req-a76d922e-4aaa-4357-83cb-5e5a1869b5cc 
> 31020076174943bdb7486c330a298d93 d1e3aae242f14c488d2225dcbf1e96d6] 
> [instance: b17bfae2-27b4-49a4-9d1b-bd739b400347] Cannot reboot 
> instance: Image 39baad54-6ce1-4f42-b431-1bac4fd6df28 could not be found.
>
> is wrong and even useless because it points in the wrong direction. 
> The problem here wasn't the glance image at all; part of the 
> qcow drive was missing. Should I file a bug for this? I'm on fire 
> today hahaha.
>
>
> 2014-05-27 18:19 GMT-03:00 Juan José Pavlik Salles <jjpavlik at gmail.com 
> <mailto:jjpavlik at gmail.com>>:
>
>     I'm not proud of this... somehow George was right. Last week we
>     migrated our instances from a gfs2 volume to an ocfs2 one and we
>     copied "all" the files from one volume to the other, then mounted
>     the new one and started the VMs. BUT... it seems a few files were
>     lost during the last node failure, and the files that were supposed
>     to be in the _base dir weren't there (this is an awkward answer I'll
>     have to improve before telling my boss about it). You can see it
>     here:
>
>      root at cebolla:/var/lib/nova# ll instances/_base/
>     instances_17_05_2014/_base/
>     instances_17_05_2014/_base/:
>     total 6572308
>     drwxr-xr-x  2 nova         nova       4096 may 17 20:50 ./
>     drwxr-xr-x 27 root         root       4096 may 17 20:57 ../
>     -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50
>     1cfaaa19259a9538efb89dd674645af7ad334322
>     -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50
>     6a861f8328e7fd0b4bd80bf95dbb7fd2b782e0bd
>     -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50
>     99edbbef0de23ac4ed20015ded60887690444661
>     -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50
>     d04d963a4efa93ecacaadc272ab841c1dd901c9f
>     -rw-r--r--  1 nova         nova 8589934592 nov 18  2013 swap
>     -rw-r--r--  1 libvirt-qemu kvm   536870912 nov 15  2013 swap_512
>
>     instances/_base/:
>     total 2424832
>     drwxr-xr-x  2 nova         nova       3896 may 27 18:02 ./
>     drwxr-xr-x 28 nova         nova       3896 may 27 17:45 ../
>     -rw-r--r--  1 nova         nova 2147483648 may 27 17:34
>     1cfaaa19259a9538efb89dd674645af7ad334322
>     -rw-r--r--  1 nova         nova 8589934592 nov 18  2013 swap
>     -rw-r--r--  1 libvirt-qemu kvm   536870912 nov 15  2013 swap_512
>     root at cebolla:/var/lib/nova#
>
>     Before that I checked that the qcow disks of the instances were
>     backed by a file that didn't exist at all!!!:
>
>     root at cebolla:/var/lib/nova/instances/b17bfae2-27b4-49a4-9d1b-bd739b400347#
>     qemu-img info disk
>     image: disk
>     file format: qcow2
>     virtual size: 10G (10737418240 bytes)
>     disk size: 2.6G
>     cluster_size: 65536
>     backing file:
>     */var/lib/nova/instances/_base/99edbbef0de23ac4ed20015ded60887690444661*
>     (actual path:
>     /var/lib/nova/instances/_base/99edbbef0de23ac4ed20015ded60887690444661)
>     root at cebolla:/var/lib/nova/instances/b17bfae2-27b4-49a4-9d1b-bd739b400347#
>
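>     To spot this without checking every instance by hand, a rough
>     (untested) loop like this would list each disk whose backing file
>     is gone; paths assume the stock layout:
>
>     cd /var/lib/nova/instances
>     for d in */disk; do
>         # pull the backing-file path out of the qemu-img info output
>         b=$(qemu-img info "$d" | sed -n 's/^backing file: \([^ ]*\).*/\1/p')
>         [ -n "$b" ] && [ ! -e "$b" ] && echo "$d -> missing $b"
>     done
>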
>     Basically, I copied the missing files from the older volume
>     (6a861f8328e7fd0b4bd80bf95dbb7fd2b782e0bd,
>     99edbbef0de23ac4ed20015ded60887690444661 and
>     d04d963a4efa93ecacaadc272ab841c1dd901c9f) and started the VMs.
>     Everything is up and running again, sorry for the inconvenience
>     and thanks!!!
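>
>     (For the archives, the fix was roughly the following; the old
>     volume's path is the one shown above, and the ownership matches
>     the old _base listing:)
>
>     cd /var/lib/nova
>     for h in 6a861f8328e7fd0b4bd80bf95dbb7fd2b782e0bd \
>              99edbbef0de23ac4ed20015ded60887690444661 \
>              d04d963a4efa93ecacaadc272ab841c1dd901c9f; do
>         # copy each missing base image back and restore ownership
>         cp instances_17_05_2014/_base/$h instances/_base/
>         chown nova:kvm instances/_base/$h
>     done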
>
>
>
>     2014-05-27 17:35 GMT-03:00 Juan José Pavlik Salles
>     <jjpavlik at gmail.com <mailto:jjpavlik at gmail.com>>:
>
>         What if I change the image ID in the glance DB to an existing
>         image's ID? As far as I can see, if you delete an image you
>         can't reboot the instances that were created from it, which
>         doesn't sound right. I must be missing something here...
>
>
>         2014-05-27 16:56 GMT-03:00 Juan José Pavlik Salles
>         <jjpavlik at gmail.com <mailto:jjpavlik at gmail.com>>:
>
>             Great, now I understand that, new thing learned hahah! But
>             this problem doesn't seem to be related to the _base
>             files; the log says it couldn't find the image file,
>             that's why I'm confused and don't see the point. I'll try
>             digging into the code a bit; maybe it's a simple check and
>             there's no real need for the image file.
>
>
>             2014-05-27 16:29 GMT-03:00 George Shuklin
>             <george.shuklin at gmail.com <mailto:george.shuklin at gmail.com>>:
>
>                 _base contains the 'base' copy of the disk, if the
>                 disk is in qcow format.
>
>                 A qcow disk consists of a basic (unmodified) image
>                 plus a file with the changes. If the instance never
>                 writes to some area, it is read from the base copy.
>                 As soon as it writes something there, the new data is
>                 read from the instance disk, not from _base.
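>
>                 You can see the mechanism with qemu-img alone (a
>                 standalone sketch, nothing nova-specific):
>
>                 # a raw base image plus a qcow2 overlay on top of it
>                 qemu-img create -f raw base.img 1G
>                 qemu-img create -f qcow2 -b base.img overlay.qcow2
>                 # info shows "backing file: base.img"; writes land in
>                 # overlay.qcow2, unwritten areas still read base.img
>                 qemu-img info overlay.qcow2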
>
>
>
>                 On 05/27/2014 10:18 PM, Juan José Pavlik Salles wrote:
>>                 Hi George, I don't really understand the relationship
>>                 between _base and
>>                 b17bfae2-27b4-49a4-9d1b-bd739b400347 (the instance
>>                 directory, where the disks are). This is what _base
>>                 contains:
>>
>>                 root at cebolla:/var/lib/nova/instances# ll _base/
>>                 total 2424832
>>                 drwxr-xr-x  2 nova nova       3896 may 27 15:23 ./
>>                 drwxr-xr-x 28 nova nova       3896 may 27 14:36 ../
>>                 -rw-r--r--  1 nova         kvm  2147483648 may 27
>>                 15:52 1cfaaa19259a9538efb89dd674645af7ad334322
>>                 -rw-r--r--  1 nova nova 8589934592 nov 18  2013 swap
>>                 -rw-r--r--  1 libvirt-qemu kvm   536870912 nov 15
>>                  2013 swap_512
>>                 root at cebolla:/var/lib/nova/instances#
>>
>>                 And I've checked the glance DB:
>>                 the 39baad54-6ce1-4f42-b431-1bac4fd6df28 record is
>>                 indeed marked as deleted and the file is gone:
>>
>>                 root at acelga:/var/lib/glance# ls images
>>                 37a88684-f1d8-472a-8681-65eb047c2100
>>                  c94ee2f6-fae5-451c-9633-18c33ec512de
>>                  d21dd4db-389c-4f4c-a749-91acc1262652
>>                 root at acelga:/var/lib/glance#
>>
>>                 Is there any healthy way to start the instances
>>                 without this lost image? Do I really need the image
>>                 to start the instances?
>>
>>                 Thanks
>>
>>
>>                 2014-05-27 15:58 GMT-03:00 George Shuklin
>>                 <george.shuklin at gmail.com
>>                 <mailto:george.shuklin at gmail.com>>:
>>
>>                     I think nova checks whether the image is in place
>>                     and available, to restore the _base copy if it is
>>                     missing. But if _base is fine, I think it's
>>                     strange to complain about glance images...
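>>
>>                     (If _base is intact and you want an instance to
>>                     stop depending on it at all, one generic qemu-img
>>                     trick, not something nova does for you, is to
>>                     merge the base copy into the instance disk; back
>>                     it up and shut the VM off first:)
>>
>>                     qemu-img rebase -b "" \
>>                         /var/lib/nova/instances/<uuid>/disk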
>>
>>
>>                     On 05/27/2014 09:32 PM, Juan José Pavlik Salles
>>                     wrote:
>>>                     Hi guys, today we found out that one of our
>>>                     compute nodes had rebooted during the night, so
>>>                     when I got to the office I started rebooting the
>>>                     instances but... they never started. After quite
>>>                     a few reboots I saw the light at the end of the
>>>                     tunnel...
>>>
>>>                     2014-05-27 15:23:45.002 ERROR
>>>                     nova.compute.manager
>>>                     [req-a76d922e-4aaa-4357-83cb-5e5a1869b5cc
>>>                     31020076174943bdb7486c330a298d93
>>>                     d1e3aae242f14c488d2225dcbf1e96d6] [instance:
>>>                     b17bfae2-27b4-49a4-9d1b-bd739b400347] Cannot
>>>                     reboot instance: Image
>>>                     39baad54-6ce1-4f42-b431-1bac4fd6df28 could not
>>>                     be found.
>>>
>>>                     I've got 3 instances with this same error; all
>>>                     of them were created from the same glance image,
>>>                     which is no longer among us (replaced by a new
>>>                     one). My question is: why do the instances need
>>>                     the image to start? The instance disks are there:
>>>
>>>                     root at cebolla:/var/lib/nova# ll
>>>                     instances/b17bfae2-27b4-49a4-9d1b-bd739b400347/
>>>                     total 3233792
>>>                     drwxr-xr-x  2 nova nova     3896 feb 20 12:49 ./
>>>                     drwxr-xr-x 28 nova nova     3896 may 27 14:36 ../
>>>                     -rw-rw----  1 root root        0 may 27 15:23
>>>                     console.log
>>>                     -rw-r--r--  1 root root 2773155840 may 24 20:23 disk
>>>                     -rw-r--r--  1 root root  537198592 may 16 16:14
>>>                     disk.swap
>>>                     -rw-r--r--  1 nova nova     1782 may 27 15:23
>>>                     libvirt.xml
>>>                     root at cebolla:/var/lib/nova#
>>>
>>>                     Any ideas will be more than appreciated.
>>>
>>>                     Thanks guys!
>>>
>>>                     -- 
>>>                     Pavlik Salles Juan José
>>>                     Blog - http://viviendolared.blogspot.com
>>>
>>>
>>
>>
>>
>>
>>
>>
>>                 -- 
>>                 Pavlik Salles Juan José
>>                 Blog - http://viviendolared.blogspot.com
>
>
>
>
>             -- 
>             Pavlik Salles Juan José
>             Blog - http://viviendolared.blogspot.com
>
>
>
>
>         -- 
>         Pavlik Salles Juan José
>         Blog - http://viviendolared.blogspot.com
>
>
>
>
>     -- 
>     Pavlik Salles Juan José
>     Blog - http://viviendolared.blogspot.com
>
>
>
>
> -- 
> Pavlik Salles Juan José
> Blog - http://viviendolared.blogspot.com
