[Openstack-operators] Can't reboot instance because the source Image was deleted - Grizzly - Ubuntu 12.04

Juan José Pavlik Salles jjpavlik at gmail.com
Tue May 27 21:25:09 UTC 2014


By the way... there's something we can learn from this misunderstanding.
The error message:

2014-05-27 15:23:45.002 ERROR nova.compute.manager
[req-a76d922e-4aaa-4357-83cb-5e5a1869b5cc 31020076174943bdb7486c330a298d93
d1e3aae242f14c488d2225dcbf1e96d6] [instance:
b17bfae2-27b4-49a4-9d1b-bd739b400347] Cannot reboot instance: Image
39baad54-6ce1-4f42-b431-1bac4fd6df28 could not be found.

is wrong and even useless, because it points in the wrong direction. The
problem here wasn't the glance image at all: the backing file of the qcow2
disk was missing from _base. Should I file a bug for this? I'm on fire today
hahaha.
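
For anyone hitting the same error, one way to spot this situation is to check
whether the backing file of every instance disk still exists. What follows is
only a minimal sketch in Python, assuming the /var/lib/nova/instances layout
shown in this thread and qcow2 disks named "disk". The function names are made
up for illustration and are not part of nova. It reads the backing file path
straight from the qcow2 header (the same value "qemu-img info" prints as
"backing file"), so it should not depend on a particular qemu-img version.

```python
import os
import struct

QCOW2_MAGIC = b"QFI\xfb"  # first four bytes of every qcow2 image


def read_backing_file(disk_path):
    """Return the backing file path stored in a qcow2 header, or None.

    None is returned for non-qcow2 files and for qcow2 images that
    have no backing file recorded.
    """
    with open(disk_path, "rb") as f:
        header = f.read(20)
        if len(header) < 20 or header[:4] != QCOW2_MAGIC:
            return None
        # Big-endian fields after the magic: version (4 bytes),
        # backing_file_offset (8 bytes), backing_file_size (4 bytes).
        _version, offset, size = struct.unpack(">IQI", header[4:20])
        if offset == 0 or size == 0:
            return None
        f.seek(offset)
        return f.read(size).decode("utf-8", errors="replace")


def find_missing_backing_files(instances_dir):
    """List (disk, backing_file) pairs whose backing file does not exist."""
    missing = []
    for name in sorted(os.listdir(instances_dir)):
        disk = os.path.join(instances_dir, name, "disk")
        if not os.path.isfile(disk):
            continue  # e.g. _base itself, or a directory without a disk
        backing = read_backing_file(disk)
        if backing is None:
            continue
        if not os.path.isabs(backing):
            # Relative backing paths are resolved against the disk's directory.
            backing = os.path.join(os.path.dirname(disk), backing)
        if not os.path.exists(backing):
            missing.append((disk, backing))
    return missing
```

Calling find_missing_backing_files("/var/lib/nova/instances") on the compute
node (as a user allowed to read the disks) would return the disks that point
at a file missing from _base, which is the real problem the reboot error was
hiding here.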


2014-05-27 18:19 GMT-03:00 Juan José Pavlik Salles <jjpavlik at gmail.com>:

> I'm not proud of this... somehow George was right. Last week we migrated
> our instances from a gfs2 volume to an ocfs2 one: we copied "all" the files
> from one volume to the other, mounted the new one and started the VMs.
> BUT... it seems that a few files were lost during the last node failure, and
> the files that were supposed to be in the _base dir weren't there (this is an
> awkward answer I'll have to improve before telling my boss about this). You
> can see it here:
>
> root@cebolla:/var/lib/nova# ll instances/_base/ instances_17_05_2014/_base/
> instances_17_05_2014/_base/:
> total 6572308
> drwxr-xr-x  2 nova         nova       4096 may 17 20:50 ./
> drwxr-xr-x 27 root         root       4096 may 17 20:57 ../
> -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50 1cfaaa19259a9538efb89dd674645af7ad334322
> -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50 6a861f8328e7fd0b4bd80bf95dbb7fd2b782e0bd
> -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50 99edbbef0de23ac4ed20015ded60887690444661
> -rw-r--r--  1 nova         kvm  2147483648 may 17 20:50 d04d963a4efa93ecacaadc272ab841c1dd901c9f
> -rw-r--r--  1 nova         nova 8589934592 nov 18  2013 swap
> -rw-r--r--  1 libvirt-qemu kvm   536870912 nov 15  2013 swap_512
>
> instances/_base/:
> total 2424832
> drwxr-xr-x  2 nova         nova       3896 may 27 18:02 ./
> drwxr-xr-x 28 nova         nova       3896 may 27 17:45 ../
> -rw-r--r--  1 nova         nova 2147483648 may 27 17:34 1cfaaa19259a9538efb89dd674645af7ad334322
> -rw-r--r--  1 nova         nova 8589934592 nov 18  2013 swap
> -rw-r--r--  1 libvirt-qemu kvm   536870912 nov 15  2013 swap_512
> root@cebolla:/var/lib/nova#
>
> Before that I had checked and found that the qcow2 disk of the instance was
> backed by a file that didn't exist at all!!!:
>
> root@cebolla:/var/lib/nova/instances/b17bfae2-27b4-49a4-9d1b-bd739b400347# qemu-img info disk
> image: disk
> file format: qcow2
> virtual size: 10G (10737418240 bytes)
> disk size: 2.6G
> cluster_size: 65536
> backing file: /var/lib/nova/instances/_base/99edbbef0de23ac4ed20015ded60887690444661 (actual path: /var/lib/nova/instances/_base/99edbbef0de23ac4ed20015ded60887690444661)
> root@cebolla:/var/lib/nova/instances/b17bfae2-27b4-49a4-9d1b-bd739b400347#
>
> Basically, I copied the missing files from the older volume
> (6a861f8328e7fd0b4bd80bf95dbb7fd2b782e0bd,
> 99edbbef0de23ac4ed20015ded60887690444661 and
> d04d963a4efa93ecacaadc272ab841c1dd901c9f) and started the VMs. Everything
> is up and running again. Sorry about the inconvenience, and thanks!!!
>
>
>
> 2014-05-27 17:35 GMT-03:00 Juan José Pavlik Salles <jjpavlik at gmail.com>:
>
>> What if I change the image ID in the glance DB to an existing image's ID? As
>> far as I can see, if you delete an image you can't reboot the instances that
>> were created from that image, which doesn't sound right. I must be missing
>> something here...
>>
>>
>> 2014-05-27 16:56 GMT-03:00 Juan José Pavlik Salles <jjpavlik at gmail.com>:
>>
>>> Great, now I understand that, new thing learned hahah! But this problem
>>> doesn't seem to be related to the _base files: the log says it couldn't
>>> find the image file, that's why I'm confused and don't see the point. I'll
>>> try digging into the code a bit, maybe it's a simple check and there's no
>>> real need for the image file.
>>>
>>>
>>> 2014-05-27 16:29 GMT-03:00 George Shuklin <george.shuklin at gmail.com>:
>>>
>>>> _base contains the 'base' copy of the disk, if the disk is in qcow format.
>>>>
>>>> A qcow disk consists of a base (unmodified) image and a file with the
>>>> changes. If the instance never writes to some area, that area is read from
>>>> the base copy. As soon as it writes something there, the new data is read
>>>> from the instance's disk, not from _base.
>>>>
>>>>
>>>>
>>>> On 05/27/2014 10:18 PM, Juan José Pavlik Salles wrote:
>>>>
>>>> Hi George, I don't really understand the relationship between _base and
>>>> b17bfae2-27b4-49a4-9d1b-bd739b400347 (the instance directory, where
>>>> the disks are). This is what _base contains:
>>>>
>>>> root@cebolla:/var/lib/nova/instances# ll _base/
>>>> total 2424832
>>>> drwxr-xr-x  2 nova         nova       3896 may 27 15:23 ./
>>>> drwxr-xr-x 28 nova         nova       3896 may 27 14:36 ../
>>>> -rw-r--r--  1 nova         kvm  2147483648 may 27 15:52 1cfaaa19259a9538efb89dd674645af7ad334322
>>>> -rw-r--r--  1 nova         nova 8589934592 nov 18  2013 swap
>>>> -rw-r--r--  1 libvirt-qemu kvm   536870912 nov 15  2013 swap_512
>>>> root@cebolla:/var/lib/nova/instances#
>>>>
>>>> And I've checked the glance DB: the
>>>> 39baad54-6ce1-4f42-b431-1bac4fd6df28 record is indeed marked as
>>>> deleted and the file is gone:
>>>>
>>>> root@acelga:/var/lib/glance# ls images
>>>> 37a88684-f1d8-472a-8681-65eb047c2100
>>>> c94ee2f6-fae5-451c-9633-18c33ec512de
>>>> d21dd4db-389c-4f4c-a749-91acc1262652
>>>> root@acelga:/var/lib/glance#
>>>>
>>>> Is there any safe way to start the instances without this lost
>>>> image? Do I really need the image to start the instances?
>>>>
>>>>  Thanks
>>>>
>>>>
>>>> 2014-05-27 15:58 GMT-03:00 George Shuklin <george.shuklin at gmail.com>:
>>>>
>>>>> I think nova checks whether the image is in place and available so it can
>>>>> restore the _base image (if it is missing). But if _base is fine, I think
>>>>> it's strange to complain about glance images...
>>>>>
>>>>>
>>>>> On 05/27/2014 09:32 PM, Juan José Pavlik Salles wrote:
>>>>>
>>>>> Hi guys, today we found out that one of our compute nodes had
>>>>> rebooted during the night, so when I got to the office I started rebooting
>>>>> the instances but... they never started. After quite a few reboots I saw
>>>>> the light at the end of the tunnel...
>>>>>
>>>>>  2014-05-27 15:23:45.002 ERROR nova.compute.manager
>>>>> [req-a76d922e-4aaa-4357-83cb-5e5a1869b5cc 31020076174943bdb7486c330a298d93
>>>>> d1e3aae242f14c488d2225dcbf1e96d6] [instance:
>>>>> b17bfae2-27b4-49a4-9d1b-bd739b400347] Cannot reboot instance: Image
>>>>> 39baad54-6ce1-4f42-b431-1bac4fd6df28 could not be found.
>>>>>
>>>>> I've got 3 instances with this same error; all of them were created
>>>>> from the same glance image, which is no longer among us (replaced by a new
>>>>> one). My question is, why does the instance need the image to start? The
>>>>> instance disks are there:
>>>>>
>>>>> root@cebolla:/var/lib/nova# ll instances/b17bfae2-27b4-49a4-9d1b-bd739b400347/
>>>>> total 3233792
>>>>> drwxr-xr-x  2 nova nova       3896 feb 20 12:49 ./
>>>>> drwxr-xr-x 28 nova nova       3896 may 27 14:36 ../
>>>>> -rw-rw----  1 root root          0 may 27 15:23 console.log
>>>>> -rw-r--r--  1 root root 2773155840 may 24 20:23 disk
>>>>> -rw-r--r--  1 root root  537198592 may 16 16:14 disk.swap
>>>>> -rw-r--r--  1 nova nova       1782 may 27 15:23 libvirt.xml
>>>>> root@cebolla:/var/lib/nova#
>>>>>
>>>>> Any ideas will be more than appreciated.
>>>>>
>>>>>  Thanks guys!
>>>>>
>>>>>  --
>>>>> Pavlik Salles Juan José
>>>>> Blog - http://viviendolared.blogspot.com
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> OpenStack-operators mailing list
>>>>> OpenStack-operators at lists.openstack.org
>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators



-- 
Pavlik Salles Juan José
Blog - http://viviendolared.blogspot.com