Read Only FS after ceph issue

melanie witt melwittt at gmail.com
Mon Jan 21 20:49:27 UTC 2019


On Mon, 21 Jan 2019 17:40:52 +0000, Grant Morley 
<grant at absolutedevops.io> wrote:
> Hi all,
> 
> We are in the process of retiring one of our old platforms and last 
> night our ceph cluster briefly went into an "Error" state because one 
> of the OSDs came close to full. The data got re-balanced fine and the 
> health of ceph is now "OK" - however, about 40% of our instances now 
> have corrupt disks, which is a bit odd.
> 
> Even more strange is that we cannot get them into rescue mode. As soon 
> as we try, the instances seem to hang during the boot process when 
> they are trying to mount "/dev/vdb1", and we eventually get a kernel 
> hung-task timeout as below:
> 
> Warning: fsck not present, so skipping root file system
> [    5.644526] EXT4-fs (vdb1): INFO: recovery required on readonly filesystem
> [    5.645583] EXT4-fs (vdb1): write access will be enabled during recovery
> [  240.504873] INFO: task exe:332 blocked for more than 120 seconds.
> [  240.506986]       Not tainted 4.4.0-66-generic #87-Ubuntu
> [  240.508782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  240.511438] exe             D ffff88003714b878     0   332      1 0x00000000
> [  240.513809]  ffff88003714b878 ffff88007c18e358 ffffffff81e11500 ffff88007be81c00
> [  240.516665]  ffff88003714c000 ffff88007fc16dc0 7fffffffffffffff ffffffff81838cd0
> [  240.519546]  ffff88003714b9d0 ffff88003714b890 ffffffff818384d5 0000000000000000
> [  240.522399] Call Trace:
> 
> I have even tried using a different image for nova rescue and we are 
> getting the same results. Has anyone come across this before?
> 
> This system is running OpenStack Mitaka with Ceph Jewel.
> 
> Any help or suggestions will be much appreciated.

I don't know whether this is related, but what you describe reminded me 
of issues I have seen in the past:

https://bugs.launchpad.net/nova/+bug/1781878

See my comment #1 on the bug ^ for links to additional information on 
the same root issue.
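
Separately from the bug, it may be worth confirming that nothing on the 
ceph side is still sitting near the full ratio before attempting any 
repairs. Something along these lines with the standard ceph CLI should 
show it (output layout varies a bit between releases):

    ceph health detail     # any nearfull/full warnings still active?
    ceph osd df            # per-OSD utilization
    ceph df                # per-pool utilization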
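
If the guests can't get through ext4 journal recovery on their own, one 
workaround is to map the instance's RBD image on a node that has the 
ceph client installed and run fsck there, with the instance stopped 
first. A rough sketch, assuming the usual Mitaka/Jewel layout where 
nova stores disks as <instance_uuid>_disk in a "vms" pool (your pool 
and image names may well differ):

    rbd ls vms | grep <instance_uuid>
    rbd map vms/<instance_uuid>_disk
    # the image appears as /dev/rbdN; partition 1 is what the guest
    # sees as vdb1 while in rescue
    fsck.ext4 -fy /dev/rbd0p1
    rbd unmap /dev/rbd0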

Hope this helps in some way,
-melanie