[Swift] Object replication failures on newly upgraded servers

Mark Kirkwood mark.kirkwood at catalyst.net.nz
Fri May 28 04:58:10 UTC 2021


Hi,

I'm in the process of upgrading a Swift cluster from 2.7/Mitaka to 
2.23/Train. While in general it seems to be going well, I'm noticing 
non-zero object replication failures on the upgraded nodes only, e.g.:

$ curl http://localhost:6000/recon/replication/object
{"replication_last": 1622156911.019487, "replication_stats": {"rsync": 
40580, "success": 4141229, "attempted": 2081856, "remove": 4083, 
"suffix_count": 14960481, "failure": 26550, "hashmatch": 4127197, 
"failure_nodes": {"10.11.18.67": {"obj08": 2348, "obj09": 60, "obj10": 
3030, "obj02": 34, "obj03": 25, "obj01": 44, "obj06": 1498, "obj07": 28, 
"obj04": 69, "obj05": 36}, "10.11.18.68": {"obj03": 6901, "obj01": 293, 
"obj06": 1901, "obj04": 10281, "obj10": 1}, "10.12.18.76": {"obj10": 
1}}, "suffix_sync": 1785, "suffix_hash": 2778}, 
"object_replication_last": 1622156911.019487, "replication_time": 
1094.7836411476135, "object_replication_time": 1094.7836411476135}
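
As a rough sketch (not part of Swift's recon tooling), the per-device counts 
under "failure_nodes" can be tallied per remote node with a few lines of 
Python. The helper name summarize_failures is made up for illustration, and 
the sample is the output above trimmed to the relevant keys:

```python
import json


def summarize_failures(recon):
    """Return {node_ip: total_failures} from a recon replication dict."""
    nodes = recon["replication_stats"].get("failure_nodes", {})
    return {ip: sum(devices.values()) for ip, devices in nodes.items()}


# Recon output from above, trimmed to the failure-related keys.
sample = json.loads("""
{"replication_stats": {"failure": 26550,
  "failure_nodes": {
    "10.11.18.67": {"obj08": 2348, "obj09": 60, "obj10": 3030, "obj02": 34,
                    "obj03": 25, "obj01": 44, "obj06": 1498, "obj07": 28,
                    "obj04": 69, "obj05": 36},
    "10.11.18.68": {"obj03": 6901, "obj01": 293, "obj06": 1901,
                    "obj04": 10281, "obj10": 1},
    "10.12.18.76": {"obj10": 1}}}}
""")

totals = summarize_failures(sample)

# Print nodes with the most failures first.
for ip, count in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(ip, count)
```

For this sample the per-node totals are 19377 (10.11.18.68), 7172 
(10.11.18.67) and 1 (10.12.18.76), which add up to the reported "failure" 
count of 26550. This only shows where the failures point, not why they 
occurred.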

Examining the logs (/var/log/swift/object.log and /var/log/syslog), I see 
no red flags (i.e. no failing rsyncs noted). Any suggestions on how to get 
more information about what went wrong? E.g. for 
"10.11.18.67": {"obj08": 2348}, how can I find out what those 2348 errors were?

Regards,

Mark

P.S.: basic sanity checking is OK. Uploaded objects go where they should 
and can be retrieved from either the 2.7 or the 2.23 servers (the old and 
new version servers agree about object placement).



