[Openstack] Replication error

Clay Gerrard clay.gerrard at gmail.com
Mon Sep 23 19:34:22 UTC 2013


Run `swift-ring-builder /etc/swift/object.builder validate`; it should
report no errors and exit 0.  Can you also paste the output of
`swift-ring-builder /etc/swift/object.builder`?  It should list some
general info about the ring (number of replicas and the list of devices).
Rebalance the ring and make sure it has been distributed to all nodes.

The particular line you're seeing pop up in the traceback seems to be
looking up all of the nodes for a particular partition it found in the
objects dir.  I'm not seeing any local sanitization [1] around those
top-level directory names, so maybe it's just some garbage that was created
there outside of swift, or some file system corruption?
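
As a rough illustration of that failure mode, here is a minimal sketch. It
assumes the ring keeps one `array` of device ids per replica, indexed by
partition number, which is what the `get_part_nodes` line in the traceback
suggests. The toy ring below (4 partitions, 3 replicas, device names) and the
helper `is_valid_partition` are made up for the example and are not Swift code.

```python
from array import array

# Hypothetical toy ring: 4 partitions (part power 2), 3 replicas.
devs = [
    {'id': 0, 'device': 'sdb1'},
    {'id': 1, 'device': 'sdc1'},
    {'id': 2, 'device': 'sdd1'},
]
replica2part2dev_id = [
    array('H', [0, 1, 2, 0]),
    array('H', [1, 2, 0, 1]),
    array('H', [2, 0, 1, 2]),
]


def get_part_nodes(part):
    # Same shape as the expression shown in the traceback.
    return [devs[r[part]] for r in replica2part2dev_id]


def is_valid_partition(part):
    # Illustrative guard: a partition must index into the per-replica arrays.
    return 0 <= part < len(replica2part2dev_id[0])
```

In this sketch `get_part_nodes(3)` succeeds, while `get_part_nodes(4)` raises
`IndexError` from the `array` lookup, so a numeric directory name at or above
the ring's partition count would be one plausible trigger.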

Can you provide the output from `ls /srv/node/objects` (or wherever you
have devices configured)?
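
If the listing is long, something like the following could narrow it down. It
is only a sketch: `find_suspect_partitions` is a hypothetical helper, not part
of Swift, and it assumes that valid entries in an objects dir are decimal
partition numbers below the ring's partition count (2 ** part power).

```python
import os


def find_suspect_partitions(objects_dir, part_count):
    """Return entries in objects_dir that are not valid partition numbers."""
    suspects = []
    for name in sorted(os.listdir(objects_dir)):
        try:
            part = int(name)
        except ValueError:
            # Not a number at all, e.g. a stray file or lost+found.
            suspects.append(name)
            continue
        if not 0 <= part < part_count:
            # Numeric, but outside the range the ring knows about.
            suspects.append(name)
    return suspects


# Example call; the path and part power are placeholders for your own values:
# find_suspect_partitions('/srv/node/sdb1/objects', 2 ** 18)
```

Anything it returns is worth inspecting by hand before moving or deleting it.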

-Clay

1. https://bugs.launchpad.net/swift/+bug/1229372


On Mon, Sep 23, 2013 at 2:34 AM, Mike Preston <mike.preston at synety.com> wrote:

> Hi,
>
> We are seeing a replication error on Swift. The error is seen only on a
> single node; the other nodes appear to be working fine.
>
> The installed version is Debian wheezy with Swift 1.4.8-2+deb7u1.
>
> Sep 23 10:33:03 storage-node-01 object-replicator Starting object
> replication pass.
>
> Sep 23 10:33:03 storage-node-01 object-replicator Exception in top-level
> replication loop:
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/dist-packages/swift/obj/replicator.py", line 564, in replicate
>     jobs = self.collect_jobs()
>   File "/usr/lib/python2.7/dist-packages/swift/obj/replicator.py", line 536, in collect_jobs
>     self.object_ring.get_part_nodes(int(partition))
>   File "/usr/lib/python2.7/dist-packages/swift/common/ring/ring.py", line 103, in get_part_nodes
>     return [self.devs[r[part]] for r in self._replica2part2dev_id]
> IndexError: array index out of range
>
> Sep 23 10:33:03 storage-node-01 object-replicator Nothing replicated for
> 0.728466033936 seconds.
>
> Sep 23 10:33:03 storage-node-01 object-replicator Object replication
> complete. (0.01 minutes)
>
> Can anyone shed any light on this or next steps in debugging it or fixing
> it?
>
> *Mike Preston*
>
> Infrastructure Team  |  SYNETY
>
> www.synety.com
>
> direct: 0116 424 4016
>
> mobile: 07950 892038
>
> main: 0116 424 4000
>