[Openstack-operators] Ceph recovery going unusually slow

Grant Morley grantmorley1985 at gmail.com
Fri Jun 2 12:42:56 UTC 2017


We were just applying a security patch that had been released.

After that we ran a "service ceph-all restart" on one of the nodes and the
cluster went into an error state.

What we have been doing to query the PGs so far is:

We get a list of the PGs from the output of ceph health detail

or we have used: ceph pg dump_stuck inactive

We then have a list of about 200 PGs, which are either peering,
remapped+peering, or inactive.

When we query those we use: ceph pg PGNUM query

Sometimes this replies with something like:


https://pastebin.com/m3bH34RB


Or we get a huge output that appears to be a long sequence of peering and
then remapping, the end of which looks like this:

https://pastebin.com/EwW5rftq

Or we get no reply at all and it just hangs forever.
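
For reference, the rough shape of what we run to collect the above is
something like this (the 30 second timeout is just an illustration so that a
hung query does not block the rest of the loop):

  for pg in $(ceph pg dump_stuck inactive 2>/dev/null | awk '/^[0-9]+\./ {print $1}'); do
      timeout 30 ceph pg "$pg" query > "/tmp/pg-query-${pg}.json" || echo "$pg: query hung or failed"
  done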

So, we have OSDs that hang forever when contacted directly with: ceph pg
XXXX query.  If we look at the OSDs themselves, we also have OSDs that do not
respond to queries such as: ceph tell osd.24 version - which also hangs
forever.  If we restart the OSD service it will reply for a while and then
hang forever again.

We may have more than one problem.  The first is OSDs hanging on queries as
simple as: ceph tell osd.XX version

What causes that?
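
(In case it is useful to anyone following along: when "ceph tell" hangs like
that, the daemon's local admin socket is another way to poke it that does not
go over the network. This assumes the default socket path and that osd.24
lives on the host you run it from:)

  ceph daemon osd.24 version
  ceph daemon osd.24 dump_ops_in_flight
  ceph --admin-daemon /var/run/ceph/ceph-osd.24.asok dump_historic_ops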

The other is PGs that are not peering correctly.  The NICs are all configured
correctly, we have tested the network connections and they are working, and
the ports are open, but the peering process between the OSDs is not
completing for some PGs and we have been unable to unstick it.
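
(By "tested the network connection" we mean checks roughly along these lines;
the IPs are placeholders and 6800-7300 is the default OSD port range:)

  ping -c 3 <peer-ip>
  nc -zv <peer-ip> 6789          # monitor port
  nc -zv <peer-ip> 6800-7300     # OSD ports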

Thanks,


On Fri, Jun 2, 2017 at 1:18 PM, Tyanko Aleksiev <tyanko.alexiev at gmail.com>
wrote:

> Additionally, it could be useful to know what you did during the
> maintenance.
>
> Cheers,
> Tyanko
>
> On 2 June 2017 at 14:08, Saverio Proto <zioproto at gmail.com> wrote:
>
>> To give you some help, you need to tell us which ceph version you are
>> using and, from the [osd] section of ceph.conf, what values you have for
>> the following:
>>
>> [osd]
>> osd max backfills
>> osd recovery max active
>> osd recovery op priority
>>
>> these three settings can influence the recovery speed.
>>
>> Also, do you have big enough limits?
>>
>> Check on any host the content of: /proc/`pid_of_the_osd`/limits
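>>
>> For example, something like this (osd.0 is just an arbitrary example, and
>> is assumed to run on the host you are checking from):
>>
>>   ceph daemon osd.0 config show | egrep 'osd_max_backfills|osd_recovery_max_active|osd_recovery_op_priority'
>>   cat /proc/$(pidof ceph-osd | awk '{print $1}')/limits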
>>
>>
>> Saverio
>>
>> 2017-06-02 14:00 GMT+02:00 Grant Morley <grantmorley1985 at gmail.com>:
>> > HEALTH_ERR 210 pgs are stuck inactive for more than 300 seconds; 296 pgs
>> > backfill_wait; 3 pgs backfilling; 1 pgs degraded; 202 pgs peering; 1 pgs
>> > recovery_wait; 1 pgs stuck degraded; 210 pgs stuck inactive; 510 pgs
>> > stuck unclean; 3308 requests are blocked > 32 sec; 41 osds have slow
>> > requests; recovery 2/11091408 objects degraded (0.000%); recovery
>> > 1778127/11091408 objects misplaced (16.032%);
>> > nodown,noout,noscrub,nodeep-scrub flag(s) set
>> >
>> > pg 3.235 is stuck inactive for 138232.508429, current state peering, last acting [11,26,1]
>> > pg 1.237 is stuck inactive for 138260.482588, current state peering, last acting [8,41,34]
>> > pg 2.231 is stuck inactive for 138258.316031, current state peering, last acting [24,53,8]
>> > pg 2.22e is stuck inactive for 194033.321591, current state remapped+peering, last acting [0,29,1]
>> > pg 1.22c is stuck inactive for 102514.200154, current state peering, last acting [51,7,20]
>> > pg 2.228 is stuck inactive for 138258.317797, current state peering, last acting [53,4,34]
>> > pg 1.227 is stuck inactive for 138258.244681, current state remapped+peering, last acting [48,35,11]
>> > pg 2.220 is stuck inactive for 193940.066322, current state remapped+peering, last acting [9,39,8]
>> > pg 1.222 is stuck inactive for 101474.087688, current state peering, last acting [23,11,35]
>> > pg 3.130 is stuck inactive for 99735.451290, current state peering, last acting [27,37,17]
>> > pg 3.136 is stuck inactive for 138221.552865, current state peering, last acting [26,49,10]
>> > pg 3.13c is stuck inactive for 137563.906503, current state peering, last acting [51,53,7]
>> > pg 2.142 is stuck inactive for 99962.462932, current state peering, last acting [37,16,34]
>> > pg 1.141 is stuck inactive for 138257.572476, current state remapped+peering, last acting [5,17,49]
>> > pg 2.141 is stuck inactive for 102567.745720, current state peering, last acting [36,7,15]
>> > pg 3.144 is stuck inactive for 138218.289585, current state remapped+peering, last acting [18,28,16]
>> > pg 1.14d is stuck inactive for 138260.030530, current state peering, last acting [46,43,17]
>> > pg 3.155 is stuck inactive for 138227.368541, current state remapped+peering, last acting [33,20,52]
>> > pg 2.8d is stuck inactive for 100251.802576, current state peering, last acting [6,39,27]
>> > pg 2.15c is stuck inactive for 102567.512279, current state remapped+peering, last acting [7,35,49]
>> > pg 2.167 is stuck inactive for 138260.093367, current state peering, last acting [35,23,17]
>> > pg 3.9d is stuck inactive for 117050.294600, current state peering, last acting [12,51,23]
>> > pg 2.16e is stuck inactive for 99846.214239, current state peering, last acting [25,5,8]
>> > pg 2.17b is stuck inactive for 99733.504794, current state peering, last acting [49,27,14]
>> > pg 3.178 is stuck inactive for 99973.600671, current state peering, last acting [29,16,40]
>> > pg 3.240 is stuck inactive for 28768.488851, current state remapped+peering, last acting [33,8,32]
>> > pg 3.b6 is stuck inactive for 138222.461160, current state peering, last acting [26,29,34]
>> > pg 2.17e is stuck inactive for 159229.154401, current state peering, last acting [13,42,48]
>> > pg 2.17c is stuck inactive for 104921.767401, current state remapped+peering, last acting [23,12,24]
>> > pg 3.17d is stuck inactive for 137563.979966, current state remapped+peering, last acting [43,24,29]
>> > pg 1.24b is stuck inactive for 93144.933177, current state peering, last acting [43,20,37]
>> > pg 1.bd is stuck inactive for 102616.793475, current state peering, last acting [16,30,35]
>> > pg 3.1d6 is stuck inactive for 99974.485247, current state peering, last acting [16,38,29]
>> > pg 2.172 is stuck inactive for 193919.627310, current state inactive, last acting [39,21,10]
>> > pg 1.171 is stuck inactive for 104947.558748, current state peering, last acting [49,9,25]
>> > pg 1.243 is stuck inactive for 208452.393430, current state peering, last acting [45,32,24]
>> > pg 3.aa is stuck inactive for 104958.230601, current state remapped+peering, last acting [51,12,13]
>> >
>> > 41 osds have slow requests
>> > recovery 2/11091408 objects degraded (0.000%)
>> > recovery 1778127/11091408 objects misplaced (16.032%)
>> > nodown,noout,noscrub,nodeep-scrub flag(s) set
>> >
>> > That is what we seem to be getting a lot of. It appears the PGs are just
>> > stuck as inactive. I am not sure how to get around that.
>> >
>> > Thanks,
>> >
>> > On Fri, Jun 2, 2017 at 12:55 PM, Saverio Proto <zioproto at gmail.com> wrote:
>> >>
>> >> Usually 'ceph health detail' gives better info on what is making
>> >> everything stuck.
>> >>
>> >> Saverio
>> >>
>> >> 2017-06-02 13:51 GMT+02:00 Grant Morley <grantmorley1985 at gmail.com>:
>> >> > Hi All,
>> >> >
>> >> > I wonder if anyone could help at all.
>> >> >
>> >> > We were doing some routine maintenance on our ceph cluster, and after
>> >> > running a "service ceph-all restart" on one of our nodes we noticed
>> >> > that something wasn't quite right. The cluster has gone into an error
>> >> > state, we have multiple stuck PGs, and the object recovery is taking
>> >> > a strangely long time. At first about 46% of objects were misplaced
>> >> > and we are now down to roughly 16%.
>> >> >
>> >> > However, it has taken about 36 hours to get this far, and with a
>> >> > possible 16% still to go we are looking at a fairly major issue. As a
>> >> > lot of the system is now blocked for reads / writes, customers cannot
>> >> > access their VMs.
>> >> >
>> >> > I think the main issue at the moment is that we have 210 PGs stuck
>> >> > inactive and nothing we do seems to get them to peer.
>> >> >
>> >> > Below is an output of the ceph status. Can anyone help, or have any
>> >> > ideas on how to speed up the recovery process? We have tried turning
>> >> > down logging on the OSDs, but some are going so slowly they won't let
>> >> > us injectargs into them.
>> >> >
>> >> > health HEALTH_ERR
>> >> >             210 pgs are stuck inactive for more than 300 seconds
>> >> >             298 pgs backfill_wait
>> >> >             3 pgs backfilling
>> >> >             1 pgs degraded
>> >> >             200 pgs peering
>> >> >             1 pgs recovery_wait
>> >> >             1 pgs stuck degraded
>> >> >             210 pgs stuck inactive
>> >> >             512 pgs stuck unclean
>> >> >             3310 requests are blocked > 32 sec
>> >> >             recovery 2/11094405 objects degraded (0.000%)
>> >> >             recovery 1785063/11094405 objects misplaced (16.090%)
>> >> >             nodown,noout,noscrub,nodeep-scrub flag(s) set
>> >> >
>> >> >             election epoch 16314, quorum 0,1,2,3,4,5,6,7,8
>> >> >             storage-1,storage-2,storage-3,storage-4,storage-5,storage-6,
>> >> >             storage-7,storage-8,storage-9
>> >> >      osdmap e213164: 54 osds: 54 up, 54 in; 329 remapped pgs
>> >> >             flags nodown,noout,noscrub,nodeep-scrub
>> >> >       pgmap v41030942: 2036 pgs, 14 pools, 14183 GB data, 3309 kobjects
>> >> >             43356 GB used, 47141 GB / 90498 GB avail
>> >> >             2/11094405 objects degraded (0.000%)
>> >> >             1785063/11094405 objects misplaced (16.090%)
>> >> >                 1524 active+clean
>> >> >                  298 active+remapped+wait_backfill
>> >> >                  153 peering
>> >> >                   47 remapped+peering
>> >> >                   10 inactive
>> >> >                    3 active+remapped+backfilling
>> >> >                    1 active+recovery_wait+degraded+remapped
>> >> >
>> >> > Many thanks,
>> >> >
>> >> > Grant
>> >> >
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > OpenStack-operators mailing list
>> >> > OpenStack-operators at lists.openstack.org
>> >> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>> >> >
>> >
>> >
>>
>> _______________________________________________
>> OpenStack-operators mailing list
>> OpenStack-operators at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>