Open Stack

Fri May 28 04:58:10 UTC 2021

Hi,

I'm in the process of upgrading a Swift cluster from 2.7/Mitaka to 
2.23/Train. While in general it seems to be going well, I'm noticing 
non-zero object replication failures on the upgraded nodes only, e.g.:

$ curl http://localhost:6000/recon/replication/object
{"replication_last": 1622156911.019487, "replication_stats": {"rsync": 
40580, "success": 4141229, "attempted": 2081856, "remove": 4083, 
"suffix_count": 14960481, "failure": 26550, "hashmatch": 4127197, 
"failure_nodes": {"10.11.18.67": {"obj08": 2348, "obj09": 60, "obj10": 
3030, "obj02": 34, "obj03": 25, "obj01": 44, "obj06": 1498, "obj07": 28, 
"obj04": 69, "obj05": 36}, "10.11.18.68": {"obj03": 6901, "obj01": 293, 
"obj06": 1901, "obj04": 10281, "obj10": 1}, "10.12.18.76": {"obj10": 
1}}, "suffix_sync": 1785, "suffix_hash": 2778}, 
"object_replication_last": 1622156911.019487, "replication_time": 
1094.7836411476135, "object_replication_time": 1094.7836411476135}
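
The per-device counts under "failure_nodes" in the output above are easier to scan once flattened and sorted. The following is a minimal sketch, not part of Swift itself: it assumes the same recon endpoint as the curl command above, and the function names (summarize_failures, fetch_recon) are hypothetical helpers introduced only for illustration. It reports where failures are concentrated; it does not reveal what the individual errors were.

```python
import json
import urllib.request

# Assumption: same recon endpoint as the curl command above.
RECON_URL = "http://localhost:6000/recon/replication/object"


def summarize_failures(recon):
    """Return (node, device, count) tuples, largest failure count first."""
    stats = recon.get("replication_stats", {})
    nodes = stats.get("failure_nodes") or {}
    rows = [
        (node, device, count)
        for node, devices in nodes.items()
        for device, count in devices.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


def fetch_recon(url=RECON_URL):
    """Fetch and decode the recon JSON document from an object server."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def print_summary(recon):
    """Print one line per remote node/device pair with its failure count."""
    for node, device, count in summarize_failures(recon):
        print(f"{node:15} {device:8} {count:6d}")
```

Calling print_summary(fetch_recon()) on an upgraded node would list the remote node and device pairs with the most failures first, which may help narrow down which devices to inspect.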

Examining the logs (/var/log/swift/object.log and /var/log/syslog) does 
not throw up any red flags (i.e. no failing rsyncs are noted). Any 
suggestions on how to get more information about what went wrong? For 
example, for "10.11.18.67": {"obj08": 2348}, how can I find out what 
those 2348 errors were?

Regards,

Mark

P.S.: basic sanity checking is OK: uploaded objects go where they should 
and can be retrieved from either the 2.7 or 2.23 servers (the old and 
new version servers agree about object placement).

[Swift] Object replication failures on newly upgraded servers