[openstack-operators] RBD problems after data center power outage
Dinçer Çelik
hello at dincercelik.com
Fri Oct 18 19:22:12 UTC 2019
Hi,
I've fixed the issue.
First, I would like to thank all the Ceph developers for making it bulletproof.
The root cause was Nova's "force_config_drive" [1] option, which I had enabled a few weeks ago. When this option is enabled, Nova creates an additional disk for each instance, named like the instance's disk but with a ".config" suffix (for example, "<instance-uuid>_disk.config"). I had enabled it because I sometimes run into DHCP-related issues.
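For anyone who wants to check the same thing on their own cluster, here is a minimal sketch of how the config drive image name can be derived and looked up. It assumes the instance disks live in the "vms" pool, as in the error message from my first mail; the two function names are hypothetical helpers, not part of Nova or Ceph.

```shell
#!/bin/sh
# Build the RBD image spec of an instance's config drive.
# Assumption: ephemeral disks live in the "vms" pool unless a second
# argument names another pool.
config_drive_spec() {
    pool="${2:-vms}"
    printf '%s/%s_disk.config\n' "$pool" "$1"
}

# Report whether the config drive image can be read.
# Needs the Ceph "rbd" CLI and a keyring with read access to the pool;
# a failure can also mean an auth or connectivity problem, not only a
# missing image.
check_config_drive() {
    spec="$(config_drive_spec "$1" "${2:-vms}")"
    if rbd info "$spec" >/dev/null 2>&1; then
        echo "present: $spec"
    else
        echo "missing or unreadable: $spec"
    fi
}
```

For example, "check_config_drive c2b54eac-179b-4907-9d61-8e075edc21cf" looks up vms/c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config.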
Temporarily disabling this option fixed the issue.
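As a rough sketch, the change looks like the fragment below. The file path is only the usual default and may differ in your deployment, and nova-compute has to be restarted afterwards (the service name varies by distribution).

```ini
# /etc/nova/nova.conf on the compute nodes (assumed default path)
[DEFAULT]
# False is the documented default: config drives are then only created
# when requested per instance or via image metadata.
force_config_drive = False
```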
Regards
[1] https://docs.openstack.org/nova/stein/configuration/config.html#DEFAULT.force_config_drive
/* Please encrypt every message you can. Privacy is your right, don't let anyone take it from you. */
/* My fingerprint is: 5E50 ABB0 F108 24DA 10CC BD43 D2AE DD2A 7893 0EAA */
> On 18 Oct 2019, at 14:29, Eugen Block <eblock at nde.ag> wrote:
>
> I assumed the header was missing because of this message:
>
>> error reading header from c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config: No such file or directory
>
> If you can stat the header file, can you share the output of
>
> rados -p vms listomapvals rbd_header.<BLOCK_PREFIX>
>
> Are there rbd_data objects left in the pool from that config drive?
>
> rados -p images ls | grep <BLOCK_PREFIX>
> rbd_object_map.1cbc666b8b4567
> rbd_data.1cbc666b8b4567.0000000000000000
> rbd_header.1cbc666b8b4567
>
> If yes, maybe there's a way to put things back together, although I haven't done that myself yet. Are all affected VMs referring to a config drive, and is it always the config drive object that's missing?
>
>
> Quoting Dinçer Çelik <hello at dincercelik.com>:
>
>> Hi Eugen,
>>
>> I think this is not the same situation as the one I'm facing, because I can get the rbd headers.
>>
>> Regards
>>
>>> On 18 Oct 2019, at 09:44, Eugen Block <eblock at nde.ag> wrote:
>>>
>>> Hi,
>>>
>>> I've recently found this post [1] on recovering a failing header, but I haven't tried it myself. I'm curious whether it works, though.
>>>
>>> Regards,
>>> Eugen
>>>
>>> https://fnordahl.com/2017/04/17/ceph-rbd-volume-header-recovery/
>>>
>>>
>>> Quoting Dinçer Çelik <hello at dincercelik.com>:
>>>
>>>> Greetings,
>>>>
>>>> Today I had a data center power outage, and the OpenStack cluster went down. After bringing the cluster back up, I cannot start some VMs due to the error below. I've tried "rbd object-map rebuild", but it didn't work. What's the proper way to re-create the missing "_disk.config" files?
>>>>
>>>> Thanks.
>>>>
>>>> [instance: c2b54eac-179b-4907-9d61-8e075edc21cf] Failed to start libvirt guest: libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-10-17T23:19:41.103720Z qemu-system-x86_64: -drive file=rbd:vms/c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config:id=nova:auth_supported=cephx\;none:mon_host=10.250.129.10\:6789\;10.250.129.11\:6789\;10.250.129.12\:6789\;10.250.129.15\:6789,file.password-secret=ide0-0-0-secret0,format=raw,if=none,id=drive-ide0-0-0,readonly=on,cache=writeback,discard=unmap: error reading header from c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config: No such file or directory