[Openstack] [Swift] Replication progress tracking

Clay Gerrard clay.gerrard at gmail.com
Tue Jan 17 01:49:02 UTC 2017

We track and prominently display the time since the last replication cycle
completed some minutes after a ring was deployed (the raw data is available
in recon data [1]) and also monitor counts of handoff partitions per device
(aggregated per node and cluster wide) [2].
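A minimal sketch of the "time since the last replication cycle completed" check, reading the recon data. The cache path `/var/cache/swift/object.recon` and the key name `object_replication_last` are assumptions about how the object replicator's recon cache is laid out on your nodes; check your own `recon_cache_path` and the file contents before relying on either.

```python
import json
import time

# Assumed default location of the object recon cache; adjust to match
# the recon_cache_path setting on your storage nodes.
RECON_CACHE = "/var/cache/swift/object.recon"


def load_recon(path=RECON_CACHE):
    """Parse the recon cache file, which is a single JSON document."""
    with open(path) as f:
        return json.load(f)


def seconds_since_last_replication(recon, now=None):
    """Return seconds since the last completed replication pass.

    Returns None when the (assumed) "object_replication_last" timestamp
    is absent, e.g. because no pass has completed yet.
    """
    last = recon.get("object_replication_last")
    if last is None:
        return None
    if now is None:
        now = time.time()
    return now - float(last)
```

Comparing that age against the time of the last ring deployment tells you whether a full pass has completed since the ring changed.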

You could also try to confirm you can observe the dreaded "Lockup
detected.. killing live coros" message [3] and perhaps take some
operational action based on that...
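One way to act on that message is to scan the replicator's log for it. The sketch below only matches the message text quoted above; where your replicator logs end up, and what the rest of each line looks like, depend on your syslog setup and are not assumed here.

```python
# Message text as quoted in this thread.
LOCKUP_MARKER = "Lockup detected.. killing live coros"


def find_lockups(lines):
    """Return the log lines that contain the lockup message."""
    return [line.rstrip("\n") for line in lines if LOCKUP_MARKER in line]
```

For example, `find_lockups(open(path))` with `path` pointing at wherever your object-replicator log lives; a non-empty result could feed an alert.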


2. Basically: look on disk, compare to the ring, and say which partitions
are handoffs, then sample that every so often.  The "compare to ring, say
which ones are handoffs" part looks roughly like this:
https://gist.github.com/clayg/90143abc1c34e259752bf333f485a37e
The "look on disk" and "sample that every so often" parts don't currently
have prescriptive implementations I can refer you to.
3. https://bugs.launchpad.net/swift/+bug/1575277
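The approach in note 2 (look on disk, compare to the ring, report handoffs) can be sketched roughly as follows. This is not the code from the gist. It assumes partitions appear as numeric directory names under a device's objects directory, and that `ring` is any object with a `get_part_nodes(part)` method returning node dicts with `ip` and `device` keys, as Swift's `swift.common.ring.Ring` does. The function names and the `local_ips` parameter are hypothetical.

```python
import os


def partitions_on_disk(objects_dir):
    """List partition numbers found as numeric directories on disk."""
    return sorted(
        int(name)
        for name in os.listdir(objects_dir)
        if name.isdigit() and os.path.isdir(os.path.join(objects_dir, name))
    )


def handoff_partitions(ring, device_name, partitions, local_ips):
    """Return the partitions for which this device is not a primary.

    A partition counts as primary here only if the ring assigns it to a
    node whose device name matches and whose IP is one of local_ips;
    device names alone repeat across nodes.
    """
    handoffs = []
    for part in partitions:
        primaries = ring.get_part_nodes(part)
        is_primary = any(
            node["device"] == device_name and node["ip"] in local_ips
            for node in primaries
        )
        if not is_primary:
            handoffs.append(part)
    return handoffs
```

With Swift installed, the ring could be loaded with something like `Ring('/etc/swift', ring_name='object')`; running the two functions per device on a timer and recording `len(handoffs)` would give the sampled count. Paths such as `/srv/node/<device>/objects` are assumptions about your layout.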

On Mon, Jan 16, 2017 at 5:11 PM, Mark Kirkwood <
mark.kirkwood at catalyst.net.nz> wrote:

> Hi,
> We suffered a hung object replicator recently. In the process of sorting
> that out some questions came to mind:
> 1/ Reliably determining if a replicator has hung (or just has nothing to
> do)
> 2/ Determining how behind replication is
> Now the output of swift-recon combined with the dispersion report
> certainly *suggests* that (say in case 1) there is work to do but nothing
> is happening. However, is there a known way to determine that 'ok chaps the
> replicator has hung...'?
> Along the same lines the next question I'm being asked is about 2/ 'How
> behind/how much work is left for the replicator'? From previous reading of
> the code it looks like the replicator creates jobs (each of which is a
> partition + a set of suffixes) - so is there a way to poke the daemon and
> ask something like 'how many jobs do you have to go this run'?
> regards
> Mark
> _______________________________________________
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to     : openstack at lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack