[Openstack] [Swift] Drive failure detection and recovery using swift-drive-audit

Shrinand Javadekar shrinand at maginatics.com
Fri Dec 5 19:47:26 UTC 2014


The OpenStack Swift admin guide talks about the swift-drive-audit tool
for detecting failed drives and unmounting them. It says that this
tool should be set up to run as a periodic cron job.

I have a question about configuring this correctly so as to:

1) Detect failures and unmount the failed drive as soon as possible AND
2) Not miss any error message in the kern.log files AND
3) Not unmount a drive that has already been replaced with a new one.

If the swift-drive-audit config file has minutes = N, then the cron
job should also run every N minutes.

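If the cron interval is less than N, consecutive runs scan overlapping
parts of the log, so the swift-drive-audit tool could potentially see
an old error again and unmount an already recovered drive.

If the cron interval is > N, part of the log falls between two runs
and it is possible to miss some messages in the log file.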
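
To make the three cases above concrete, here is a rough Python sketch of
how I picture the time windows. It is only a model of my understanding,
not swift-drive-audit's actual code: the function name audit_coverage and
its parameters are made up for illustration, and it assumes each run
scans exactly the last N minutes of kern.log.

```python
def audit_coverage(lookback_min, cron_interval_min, runs=3):
    """Model each audit run at time t as scanning log lines in (t - lookback, t].

    lookback_min      : the "minutes = N" value (assumed lookback window)
    cron_interval_min : how often cron starts the audit
    Returns the scanned windows plus the gap and overlap, in minutes,
    between two consecutive runs.
    """
    windows = []
    for i in range(1, runs + 1):
        t = i * cron_interval_min
        windows.append((t - lookback_min, t))

    # Minutes of log that no run ever looks at between two consecutive runs.
    gap = max(0, cron_interval_min - lookback_min)
    # Minutes of log that are scanned again by the following run.
    overlap = max(0, lookback_min - cron_interval_min)
    return windows, gap, overlap


if __name__ == "__main__":
    # N = 60 with cron running every 30, 60 and 90 minutes.
    for interval in (30, 60, 90):
        windows, gap, overlap = audit_coverage(60, interval)
        print(f"interval={interval} windows={windows} gap={gap} overlap={overlap}")
```

In this model only the case where the cron interval equals N gives both
a zero gap and a zero overlap, which is the reasoning behind the
analysis above.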

Is the above analysis correct?
