[Openstack] [Swift] Drive failure detection and recovery using swift-drive-audit

Shrinand Javadekar shrinand at maginatics.com
Mon Dec 8 18:36:00 UTC 2014

Thanks Clay!

On Fri, Dec 5, 2014 at 1:02 PM, Clay Gerrard <clay.gerrard at gmail.com> wrote:
> On Fri, Dec 5, 2014 at 11:47 AM, Shrinand Javadekar
> <shrinand at maginatics.com> wrote:
>> If it is less than N, the swift-drive-audit tool could potentially
>> unmount an already recovered drive.
>> If it is > N, it is possible to miss some messages in the log file.
>> Is the above analysis correct?
> You're probably not too far off, but maybe in practice the frequency and
> depth of the lookback is still lower than the minimum amount of time a dc
> tech can physically walk out to a server and swap out a failing disk that
> gets unmounted?
> Once it's unmounted it stops generating errors, so maybe it's safer to pick
> a frequency that's generally lower than the cycle time on a drive swap and
> worst case you risk a replaced drive getting unmounted again for old errors
> when the dc techs are super on the ball for some reason.  At least an extra
> unmount can be fixed remotely.
> -Clay
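
To make the trade-off above concrete, here is a rough sketch of the timing
argument. It is only an illustration, not code from swift-drive-audit: the
function and parameter names are made up, "lookback" stands for the N minutes
discussed in the thread, and the swap time is whatever a drive replacement
typically takes in a given data center.

```python
def missed_gap_minutes(run_interval_min, lookback_min):
    """Minutes of log between two consecutive runs that neither run scans.

    Hypothetical helper: if the audit runs less often than it looks back,
    part of the log is never examined.
    """
    return max(0, run_interval_min - lookback_min)


def may_unmount_replaced_drive(lookback_min, swap_time_min):
    """True if errors logged by the old drive can still be inside the
    lookback window once the replacement drive is mounted.

    Hypothetical helper: the old drive stops logging errors when it is
    unmounted, so the stale errors are at least swap_time_min old by the
    time the new drive is back in service.
    """
    return lookback_min > swap_time_min


if __name__ == "__main__":
    # Run every 90 minutes but only look back 60: 30 minutes go unscanned.
    print(missed_gap_minutes(90, 60))
    # Look back 60 minutes, swap takes 4 hours: stale errors have aged out.
    print(may_unmount_replaced_drive(60, 240))
    # Look back 60 minutes, swap done in 20: the new drive may get unmounted.
    print(may_unmount_replaced_drive(60, 20))
```

Under these assumptions, keeping the run interval at or below the lookback
avoids gaps, and keeping the lookback below the usual swap time keeps the
extra unmount limited to unusually fast replacements.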
