[Openstack-operators] Can't reboot instance because the source Image was deleted - Grizzly - Ubuntu 12.04

Juan José Pavlik Salles jjpavlik at gmail.com
Tue May 27 22:16:39 UTC 2014


Well... now that you mention that "Nova is really smart and will attempt to
redownload the image from glance if the _base part is deleted," maybe that was
actually happening, but since the image had been deleted there was no way it
could get it back from glance, and that would perfectly explain the ERROR
message. I really learned a few things today hahaha.
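A side note for anyone debugging the same thing: the file names in _base look like hex sha1 digests because, as far as I know, nova's libvirt image cache names each cached base file after the sha1 of the glance image UUID (resized copies get a size suffix). A quick sketch of that mapping, assuming that naming scheme:

```python
import hashlib

# The glance image UUID from the error message in this thread.
image_id = "39baad54-6ce1-4f42-b431-1bac4fd6df28"

# Assumption: nova's libvirt image cache names the _base file after the
# hex sha1 digest of the image UUID (resized copies add a "_<size>" suffix).
base_name = hashlib.sha1(image_id.encode()).hexdigest()
print(base_name)  # a 40-char hex name, like the files listed in _base below
```

That lets you map a cached file in _base back to the glance image it came from (and spot which image's base copy is missing).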


2014-05-27 19:07 GMT-03:00 George Shuklin <george.shuklin at gmail.com>:

>  Em... I think it's not a bug. Maybe it's some kind of inconvenience, and
> better error logging is always welcome, but the error message is correct: the
> image is not found. Not in glance, but inside nova's internals.
>
> For everyone who knows how nova works it is perfectly clear. For new users it
> is unclear, but that makes this a 'wishlist' item, not a bug.
>
> P. S. As far as I hear, Nova is really smart and will attempt to
> redownload the image from glance if the _base part is deleted.
>
>
>
> On 05/28/2014 12:25 AM, Juan José Pavlik Salles wrote:
>
> By the way... there's something we can learn from this misunderstanding.
> The error message:
>
>  2014-05-27 15:23:45.002 ERROR nova.compute.manager
> [req-a76d922e-4aaa-4357-83cb-5e5a1869b5cc 31020076174943bdb7486c330a298d93
> d1e3aae242f14c488d2225dcbf1e96d6] [instance:
> b17bfae2-27b4-49a4-9d1b-bd739b400347] Cannot reboot instance: Image
> 39baad54-6ce1-4f42-b431-1bac4fd6df28 could not be found.
>
>  is wrong and even useless, because it points in the wrong direction. The
> problem here wasn't the glance image at all; a part of the qcow drive was
> missing. Should I file a bug about this problem? I'm on fire today hahaha.
>
>
> 2014-05-27 18:19 GMT-03:00 Juan José Pavlik Salles <jjpavlik at gmail.com>:
>
>> I'm not proud of this... somehow George was right. Last week we migrated
>> our instances from a gfs2 volume to an ocfs2 one and we copied "all" the
>> files from one volume to the other, mounted the new one and started the VMs.
>> BUT... it seems that a few files were lost during the last node failure, and
>> the files that were supposed to be in the _base dir weren't there (this is an
>> awkward answer I'll have to improve before telling my boss about it). You
>> can see it here:
>>
>>   root at cebolla:/var/lib/nova# ll instances/_base/
>> instances_17_05_2014/_base/
>> instances_17_05_2014/_base/:
>> total 6572308
>> drwxr-xr-x  2 nova         nova       4096 may 17 20:50 ./
>> drwxr-xr-x 27 root         root       4096 may 17 20:57 ../
>> -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50
>> 1cfaaa19259a9538efb89dd674645af7ad334322
>> -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50
>> 6a861f8328e7fd0b4bd80bf95dbb7fd2b782e0bd
>> -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50
>> 99edbbef0de23ac4ed20015ded60887690444661
>> -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50
>> d04d963a4efa93ecacaadc272ab841c1dd901c9f
>>  -rw-r--r--  1 nova         nova 8589934592 nov 18  2013 swap
>> -rw-r--r--  1 libvirt-qemu kvm   536870912 nov 15  2013 swap_512
>>
>>  instances/_base/:
>> total 2424832
>> drwxr-xr-x  2 nova         nova       3896 may 27 18:02 ./
>>  drwxr-xr-x 28 nova         nova       3896 may 27 17:45 ../
>> -rw-r--r--  1 nova         nova 2147483648 may 27 17:34
>> 1cfaaa19259a9538efb89dd674645af7ad334322
>>  -rw-r--r--  1 nova         nova 8589934592 nov 18  2013 swap
>> -rw-r--r--  1 libvirt-qemu kvm   536870912 nov 15  2013 swap_512
>>  root at cebolla:/var/lib/nova#
>>
>>  Before that, I had checked that the qcow disks of the instances were
>> backed by a file that didn't exist at all!!!:
>>
>>  root at cebolla:/var/lib/nova/instances/b17bfae2-27b4-49a4-9d1b-bd739b400347#
>> qemu-img info disk
>> image: disk
>> file format: qcow2
>> virtual size: 10G (10737418240 bytes)
>> disk size: 2.6G
>> cluster_size: 65536
>> backing file: /var/lib/nova/instances/_base/99edbbef0de23ac4ed20015ded60887690444661 (actual path: /var/lib/nova/instances/_base/99edbbef0de23ac4ed20015ded60887690444661)
>> root at cebolla:/var/lib/nova/instances/b17bfae2-27b4-49a4-9d1b-bd739b400347#
>>
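If you need to audit a whole compute node for this, you don't even need qemu-img: the backing-file path sits at a fixed spot in the qcow2 header (a u64 offset at byte 8 and a u32 length at byte 16, per the qcow2 spec). A rough sketch that scans the instances directory for disks whose backing file is gone — the directory layout is assumed to be the standard /var/lib/nova/instances/<uuid>/disk:

```python
import os
import struct

QCOW2_MAGIC = 0x514649FB  # the bytes "QFI\xfb"

def qcow2_backing_file(path):
    """Return the backing-file path recorded in a qcow2 header, or None."""
    with open(path, "rb") as f:
        magic, _version = struct.unpack(">II", f.read(8))
        if magic != QCOW2_MAGIC:
            raise ValueError("%s is not a qcow2 image" % path)
        # backing_file_offset (u64) and backing_file_size (u32), bytes 8..20.
        backing_offset, backing_size = struct.unpack(">QI", f.read(12))
        if backing_offset == 0:
            return None  # flat image, no backing file
        f.seek(backing_offset)
        return f.read(backing_size).decode()

def dangling_disks(instances_dir="/var/lib/nova/instances"):
    """Yield (disk_path, backing_path) pairs whose backing file is missing."""
    for entry in os.listdir(instances_dir):
        disk = os.path.join(instances_dir, entry, "disk")
        if not os.path.isfile(disk):
            continue
        try:
            backing = qcow2_backing_file(disk)
        except ValueError:
            continue  # raw or otherwise non-qcow2 disk
        if backing and not os.path.exists(backing):
            yield disk, backing
```

Usage would be something like `for disk, base in dangling_disks(): print(disk, "->", base)`, run on the compute node before trusting the instances to boot.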
>>  Basically, I copied the missing files from the older volume
>> (6a861f8328e7fd0b4bd80bf95dbb7fd2b782e0bd,
>> 99edbbef0de23ac4ed20015ded60887690444661 and
>> d04d963a4efa93ecacaadc272ab841c1dd901c9f) and started the VMs. Everything
>> is up and running again. Sorry about the inconvenience, and thanks!!!
>>
>>
>>
>> 2014-05-27 17:35 GMT-03:00 Juan José Pavlik Salles <jjpavlik at gmail.com>:
>>
>>  What if I change the image ID in the glance DB to an existing image's ID?
>>> As far as I can see, if you delete an image you can't reboot the instances
>>> that were created from it, which doesn't sound right. I must be missing
>>> something here...
>>>
>>>
>>> 2014-05-27 16:56 GMT-03:00 Juan José Pavlik Salles <jjpavlik at gmail.com>:
>>>
>>>
>>>  Great, now I understand that, new thing learned hahah! But this
>>>> problem doesn't seem to be related to the _base files; the log says it
>>>> couldn't find the image file, which is why I'm confused and don't see the
>>>> point. I'll try spying on the code a bit, maybe it's a simple check and
>>>> there's no real need for the image file.
>>>>
>>>>
>>>> 2014-05-27 16:29 GMT-03:00 George Shuklin <george.shuklin at gmail.com>:
>>>>
>>>>  _base contains the 'base' copy of the disk, if the disk is in qcow
>>>>> format.
>>>>>
>>>>> A qcow disk consists of a basic (unmodified) image plus a file of changes.
>>>>> If an instance never writes to some area, it is read from the base copy. As
>>>>> soon as it writes something there, the new data is read from the instance's
>>>>> disk, not from _base.
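George's copy-on-write description can be sketched roughly like this (a toy model of the overlay-over-base idea, not qcow2's real cluster/L2-table format):

```python
class CowDisk:
    """Toy copy-on-write disk: unwritten clusters fall through to the
    shared base image; written clusters live in a per-instance overlay
    (very roughly what a qcow2 overlay does on top of a _base file)."""

    def __init__(self, base):
        self.base = base      # shared, read-only base image
        self.overlay = {}     # this instance's modified clusters

    def write(self, cluster, data):
        self.overlay[cluster] = data   # never touches the base

    def read(self, cluster):
        # Written locally? Read the overlay. Otherwise fall back to the
        # base copy -- which is why deleting _base breaks reads of every
        # cluster the instance never rewrote.
        return self.overlay.get(cluster, self.base.get(cluster))

base = {0: b"boot", 1: b"rootfs"}   # shared _base copy
vm = CowDisk(base)
vm.write(1, b"logs")                # instance modifies cluster 1
print(vm.read(0), vm.read(1))       # cluster 0 still comes from base
```

The base stays unmodified no matter what the instance writes, which is what lets many instances share one file in _base.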
>>>>>
>>>>>
>>>>>
>>>>> On 05/27/2014 10:18 PM, Juan José Pavlik Salles wrote:
>>>>>
>>>>> Hi George, I don't really understand the relationship between _base
>>>>> and b17bfae2-27b4-49a4-9d1b-bd739b400347 (the instance directory,
>>>>> where the disks are). This is what _base contains:
>>>>>
>>>>>  root at cebolla:/var/lib/nova/instances# ll _base/
>>>>> total 2424832
>>>>> drwxr-xr-x  2 nova         nova       3896 may 27 15:23 ./
>>>>> drwxr-xr-x 28 nova         nova       3896 may 27 14:36 ../
>>>>> -rw-r--r--  1 nova         kvm  2147483648 may 27 15:52
>>>>> 1cfaaa19259a9538efb89dd674645af7ad334322
>>>>> -rw-r--r--  1 nova         nova 8589934592 nov 18  2013 swap
>>>>> -rw-r--r--  1 libvirt-qemu kvm   536870912 nov 15  2013 swap_512
>>>>> root at cebolla:/var/lib/nova/instances#
>>>>>
>>>>>  And I've checked the glance DB: the
>>>>> 39baad54-6ce1-4f42-b431-1bac4fd6df28 record is indeed marked as
>>>>> deleted, and the file is gone:
>>>>>
>>>>>  root at acelga:/var/lib/glance# ls images
>>>>> 37a88684-f1d8-472a-8681-65eb047c2100
>>>>>  c94ee2f6-fae5-451c-9633-18c33ec512de  d21dd4db-389c-4f4c-a749-91acc1262652
>>>>> root at acelga:/var/lib/glance#
>>>>>
>>>>>  Is there any healthy way to start the instances without this lost
>>>>> image? Do I really need the image to start the instances?
>>>>>
>>>>>  Thanks
>>>>>
>>>>>
>>>>> 2014-05-27 15:58 GMT-03:00 George Shuklin <george.shuklin at gmail.com>:
>>>>>
>>>>>>  I think nova checks whether the image is in place and available, so it
>>>>>> can restore the _base copy (if it is missing). But if _base is fine, I
>>>>>> think it's strange to complain about glance images...
>>>>>>
>>>>>>
>>>>>> On 05/27/2014 09:32 PM, Juan José Pavlik Salles wrote:
>>>>>>
>>>>>>  Hi guys, today we found out that one of our compute nodes had
>>>>>> rebooted during the night, so when I got to the office I started
>>>>>> rebooting the instances but... they never started. After quite a few
>>>>>> reboots I saw the light at the end of the tunnel...
>>>>>>
>>>>>>  2014-05-27 15:23:45.002 ERROR nova.compute.manager
>>>>>> [req-a76d922e-4aaa-4357-83cb-5e5a1869b5cc 31020076174943bdb7486c330a298d93
>>>>>> d1e3aae242f14c488d2225dcbf1e96d6] [instance:
>>>>>> b17bfae2-27b4-49a4-9d1b-bd739b400347] Cannot reboot instance: Image
>>>>>> 39baad54-6ce1-4f42-b431-1bac4fd6df28 could not be found.
>>>>>>
>>>>>>  I've got 3 instances with this same error, all of them created
>>>>>> from the same glance image, which is no longer among us (replaced by a
>>>>>> new one). My question is: why do the instances need the image to start?
>>>>>> The instance disks are there:
>>>>>>
>>>>>>    root at cebolla:/var/lib/nova# ll
>>>>>> instances/b17bfae2-27b4-49a4-9d1b-bd739b400347/
>>>>>> total 3233792
>>>>>> drwxr-xr-x  2 nova nova       3896 feb 20 12:49 ./
>>>>>> drwxr-xr-x 28 nova nova       3896 may 27 14:36 ../
>>>>>> -rw-rw----  1 root root          0 may 27 15:23 console.log
>>>>>> -rw-r--r--  1 root root 2773155840 may 24 20:23 disk
>>>>>> -rw-r--r--  1 root root  537198592 may 16 16:14 disk.swap
>>>>>> -rw-r--r--  1 nova nova       1782 may 27 15:23 libvirt.xml
>>>>>> root at cebolla:/var/lib/nova#
>>>>>>
>>>>>>  Any ideas will be more than appreciated.
>>>>>>
>>>>>>  Thanks guys!
>>>>>>
>>>>>>  --
>>>>>> Pavlik Salles Juan José
>>>>>> Blog - http://viviendolared.blogspot.com
>>>>>>
>>>>>>
>>>>>>  _______________________________________________
>>>>>> OpenStack-operators mailing list
>>>>>> OpenStack-operators at lists.openstack.org
>>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>  --
>>>>> Pavlik Salles Juan José
>>>>> Blog - http://viviendolared.blogspot.com
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>  --
>>>> Pavlik Salles Juan José
>>>> Blog - http://viviendolared.blogspot.com
>>>>
>>>
>>>
>>>
>>>  --
>>> Pavlik Salles Juan José
>>> Blog - http://viviendolared.blogspot.com
>>>
>>
>>
>>
>>  --
>> Pavlik Salles Juan José
>> Blog - http://viviendolared.blogspot.com
>>
>
>
>
>  --
> Pavlik Salles Juan José
> Blog - http://viviendolared.blogspot.com
>
>
>


-- 
Pavlik Salles Juan José
Blog - http://viviendolared.blogspot.com

