I have noticed that every single VM is impacted.

On Fri, Feb 17, 2023 at 3:14 PM Eugen Block <eblock@nde.ag> wrote:

Well, it’s the other way around: the compute nodes are the ones
acquiring the locks as clients. If ceph goes down they can’t do
anything with the locks until the cluster is reachable again, and
sometimes a service restart is required, or a manual intervention as
in this case. These things happen; the only thing that would help
would probably be a stretched (or geo-redundant) ceph cluster to avoid
a total failure, so the cloud keeps working if one site goes down.
Do you see the same impact on that many VMs or only on some of them?
Or what does the last question refer to?

Zitat von Satish Patel <satish.txt@gmail.com>:

> Hi Eugen,
>
> I have a few questions before we close this thread.
>
> - Is it normal that ceph locks images during a power failure or disaster?
> - Shouldn't ceph release the locks automatically when the VMs shut down?
> - Is this a bug or normal ceph behavior? I am worried about what happens if I
> have 100s of VMs and need to remove the locks on all of them (see the sketch below).
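For the "100s of VMs" case, the cleanup can be scripted. A minimal sketch, assuming
the plain-text "rbd lock list" output shown further down in this thread and that the
hypervisors holding the locks are genuinely down (review the dry-run output before
setting REMOVE=1):

  POOL=vms
  REMOVE=${REMOVE:-0}   # dry run by default; set REMOVE=1 to actually delete locks

  for img in $(rbd ls -p "$POOL"); do
      # skip the two header lines of "rbd lock list", then emit locker|lock-id|address
      rbd lock list --image "$img" -p "$POOL" | awk 'NR>2 {print $1"|"$2" "$3"|"$4}' | \
      while IFS='|' read -r locker id addr; do
          echo "$POOL/$img is locked by $locker ($addr), lock id: $id"
          [ "$REMOVE" = "1" ] && rbd lock rm -p "$POOL" "$img" "$id" "$locker"
      done
  done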
>
>
>
> On Fri, Feb 17, 2023 at 10:28 AM Satish Patel <satish.txt@gmail.com> wrote:
>
>> Hi Eugen,
>>
>> You saved my life!!!!!! All my VMs are up without any filesystem errors :)
>>
>> This is the correct command to remove the lock:
>>
>> $ rbd lock rm -p vms ec6044e6-2231-4906-9e30-1e2e72573e64_disk "auto
>> 139643345791728" client.1211875
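(A quick way to double-check afterwards, as a sketch on the same image: confirm that
both the lock and any watcher are gone before booting the VM.)

  rbd lock list --image ec6044e6-2231-4906-9e30-1e2e72573e64_disk -p vms   # should print nothing
  rbd status vms/ec6044e6-2231-4906-9e30-1e2e72573e64_disk                 # should report no watchers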
>>
>>
>> On Fri, Feb 17, 2023 at 10:06 AM Satish Patel <satish.txt@gmail.com>
>> wrote:
>>
>>> Hi Eugen,
>>>
>>> I am playing with a less important machine and did the following.
>>>
>>> I shut down the VM but still see the following lock:
>>>
>>> root@ceph1:~# rbd lock list --image
>>> ec6044e6-2231-4906-9e30-1e2e72573e64_disk -p vms
>>> There is 1 exclusive lock on this image.
>>> Locker         ID                    Address
>>> client.1211875 auto 139643345791728  192.168.3.12:0/2259335316
>>>
>>> root@ceph1:~# ceph osd blacklist add 192.168.3.12:0/2259335316
>>> blocklisting 192.168.3.12:0/2259335316 until
>>> 2023-02-17T16:00:59.399775+0000 (3600 sec)
>>>
>>> I can still see it in the lock list below. Am I missing something?
>>>
>>> root@ceph1:~# rbd lock list --image
>>> ec6044e6-2231-4906-9e30-1e2e72573e64_disk -p vms
>>> There is 1 exclusive lock on this image.
>>> Locker         ID                    Address
>>> client.1211875 auto 139643345791728  192.168.3.12:0/2259335316
>>>
>>>
>>> On Fri, Feb 17, 2023 at 2:39 AM Eugen Block <eblock@nde.ag> wrote:
>>>
>>>> The lock is acquired automatically, you don't need to create one. I'm
>>>> curious why you have that many blacklist entries, maybe that is indeed
>>>> the issue here (locks are not removed). I would shut down the corrupted
>>>> VM and see if the compute node still has a lock on that image, because
>>>> after shutdown it should remove the lock (automatically). If there's
>>>> still a watcher or lock on that image after shutdown (rbd status
>>>> vms/55dbf40b-0a6a-4bab-b3a5-b4bb74e963af_disk) you can try to
>>>> blacklist the client with:
>>>> # ceph osd blacklist add <client address, as shown by rbd status or rbd lock list>
>>>>
>>>> Then check the status again; if no watchers are present, boot the VM.
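Put together, the sequence described here looks roughly like this (a sketch using the
image and client address from this thread; adjust names to your environment, and the
server UUID is a placeholder):

  rbd status vms/55dbf40b-0a6a-4bab-b3a5-b4bb74e963af_disk   # list remaining watchers/locks
  ceph osd blacklist add 192.168.3.12:0/1649312807           # fence the stale client by its address
  rbd status vms/55dbf40b-0a6a-4bab-b3a5-b4bb74e963af_disk   # should now report no watchers
  # if a stale lock entry is still listed, remove it explicitly with "rbd lock rm" as shown further up
  openstack server start <server-uuid>                       # boot the VM again once the image is clean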
>>>>
>>>>
>>>> Zitat von Satish Patel <satish.txt@gmail.com>:
>>>>
>>>> > Hi Eugen,
>>>> >
>>>> > This is what I did, let me know if I missed anything.
>>>> >
>>>> > root@ceph1:~# ceph osd blacklist ls
>>>> > 192.168.3.12:0/0 2023-02-17T04:48:54.381763+0000
>>>> > 192.168.3.22:0/753370860 2023-02-17T04:47:08.185434+0000
>>>> > 192.168.3.22:0/2833179066 2023-02-17T04:47:08.185434+0000
>>>> > 192.168.3.22:0/1812968936 2023-02-17T04:47:08.185434+0000
>>>> > 192.168.3.22:6824/2057987683 2023-02-17T04:47:08.185434+0000
>>>> > 192.168.3.21:0/2756666482 2023-02-17T05:16:23.939511+0000
>>>> > 192.168.3.21:0/1646520197 2023-02-17T05:16:23.939511+0000
>>>> > 192.168.3.22:6825/2057987683 2023-02-17T04:47:08.185434+0000
>>>> > 192.168.3.21:0/526748613 2023-02-17T05:16:23.939511+0000
>>>> > 192.168.3.21:6815/2454821797 2023-02-17T05:16:23.939511+0000
>>>> > 192.168.3.22:0/288537807 2023-02-17T04:47:08.185434+0000
>>>> > 192.168.3.21:0/4161448504 2023-02-17T05:16:23.939511+0000
>>>> > 192.168.3.21:6824/2454821797 2023-02-17T05:16:23.939511+0000
>>>> > listed 13 entries
>>>> >
>>>> > root@ceph1:~# rbd lock list --image
>>>> > 55dbf40b-0a6a-4bab-b3a5-b4bb74e963af_disk -p vms
>>>> > There is 1 exclusive lock on this image.
>>>> > Locker        ID                    Address
>>>> > client.268212 auto 139971105131968  192.168.3.12:0/1649312807
>>>> >
>>>> > root@ceph1:~# ceph osd blacklist rm 192.168.3.12:0/1649312807
>>>> > 192.168.3.12:0/1649312807 isn't blocklisted
>>>> >
>>>> > How do I create a lock?
>>>> >
>>>> >
>>>> > On Thu, Feb 16, 2023 at 10:45 AM Eugen Block <eblock@nde.ag> wrote:
>>>> >
>>>> >> In addition to Sean's response, this has been asked multiple times,
>>>> >> e.g. [1]. You could check if your hypervisors gave up the locks on the
>>>> >> RBDs or if they are still locked (rbd status <pool>/<image>); in that
>>>> >> case you might need to blacklist the clients and see if that resolves
>>>> >> anything. Do you have regular snapshots (or backups) to be able to
>>>> >> roll back in case of a corruption?
>>>> >>
>>>> >> [1] https://www.spinics.net/lists/ceph-users/msg45937.html
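(For reference, the snapshot/rollback idea above boils down to something like the
sketch below; the image and snapshot names are placeholders, and "rbd snap rollback"
needs the image to be unused, i.e. the VM stopped.)

  rbd snap create vms/<image>@known-good      # taken regularly, before any incident
  rbd snap rollback vms/<image>@known-good    # restore the image content from that snapshot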
>>>> >>
>>>> >>
>>>> >> Zitat von Sean Mooney <smooney@redhat.com>:
>>>> >>
>>>> >> > On Thu, 2023-02-16 at 09:56 -0500, Satish Patel wrote:
>>>> >> >> Folks,
>>>> >> >>
>>>> >> >> I am running a small 3 node compute/controller with 3 node ceph
>>>> >> >> storage in my lab. Yesterday, because of a power outage all my nodes
>>>> >> >> went down. After the reboot of all nodes ceph seems to show good
>>>> >> >> health and no errors (in ceph -s).
>>>> >> >>
>>>> >> >> When I started using the existing VMs I noticed the following
>>>> >> >> errors. Seems like data loss. This is a lab machine and has zero
>>>> >> >> activity on the VMs, but it still loses data and the file system is
>>>> >> >> corrupt. Is this normal?
>>>> >> > If the VM/cluster hard crashes due to the power cut, yes it can.
>>>> >> > Personally I have hit this more often with XFS than ext4, but I have
>>>> >> > seen it with both.
>>>> >> >>
>>>> >> >> I am not using erasure coding, does that help in this matter?
>>>> >> >>
>>>> >> >> blk_update_request: I/O error, dev sda, sector 233000 op 0x1: (WRITE)
>>>> >> >> flags 0x800 phys_seg 8 prio class 0
>>>> >> >
>>>> >> > You will probably need to rescue the instance and repair the
>>>> >> > filesystem of each VM with fsck or similar: boot with a rescue image
>>>> >> > -> repair the filesystem -> unrescue -> hard reboot/start the VM if
>>>> >> > needed.
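(A rough sketch of that flow with the OpenStack CLI; the server and image names are
placeholders, and inside the rescue system the instance's original disk usually shows
up as a secondary device.)

  openstack server rescue --image <rescue-image> <server>
  # ... log in to the rescue system and repair the original disk, e.g.:
  #   fsck -y /dev/sdb1
  openstack server unrescue <server>
  openstack server reboot --hard <server>   # only if it does not come back cleanly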
>>>> >> >
>>>> >> > You might be able to mitigate this somewhat by disabling disk
>>>> >> > caching at the qemu level, but that will reduce performance. Ceph
>>>> >> > recommends that you use virtio-scsi for the device model and the
>>>> >> > writeback cache mode. We generally recommend that too, however you
>>>> >> > can use the disk_cachemodes option to change that:
>>>> >> >
>>>> >> > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.disk_cachemodes
>>>> >> >
>>>> >> > [libvirt]
>>>> >> > disk_cachemodes=file=none,block=none,network=none
>>>> >> >
>>>> >> > This corruption may also have happened on the ceph cluster side;
>>>> >> > ceph has some options that can help prevent that via journaling
>>>> >> > writes.
>>>> >> >
>>>> >> > If you can afford it, I would get even a small UPS to allow a
>>>> >> > graceful shutdown if you have future power cuts, to avoid data loss
>>>> >> > issues.