[Openstack] Assets down
Samuel Merritt
sam at swiftstack.com
Fri Oct 11 21:56:54 UTC 2013
On 10/11/13 1:43 AM, Emre Sokullu wrote:
> Hi everyone,
>
> We're having a serious problem with our OpenStack installation. Most
> assets that we've been serving from http://assets00.grou.ps such as
> http://assets.grou.ps/0F2E3C/avatars/28346976/80.png are broken, but
> some still remain intact:
> http://assets00.grou.ps/0F2E3C/userimages/gelistirme/20130830032959-ykqmvcnxaulxhcbwt-image50.png
>
> Here's what happened:
>
> A few days ago, one of our servers crashed. The host (leaseweb), while
> rebooting the server for us (KVM access was broken as well), said that
> one of the disks was not being used and asked whether they should remove
> it. We said yes, do it. They did, and everything was fine until 12
> hours ago.
>
> I got a phone call saying all assets were broken. I checked the servers
> and saw that on 2 of our storage servers (we have 3 of them, each has 5
> disks) -- the ones they had rebooted that day -- some of our disks had
> been renamed: the old c0d5p1 and c0d6p1 were gone and we now had a
> c0d4p1 and a c0d5p1. And on both servers c0d5p1 was full, although it
> was supposed to be at 65%.
>
> My theory is that after they removed the disks, the disk names were
> reassigned and they changed fstab accordingly, but since the new naming
> didn't match our old configuration, things went wrong. And perhaps most
> importantly, since a c0d5p1 existed before too, the system tried to
> recreate a partition and wrote things twice, which is why c0d5p1 was
> full but not the others (c0d1p1, c0d2p1, c0d3p1, ...).
Yes, that's probably what happened. The device name in the ring is
simply the mount point; if you mount /dev/c0d3p1 at /srv/node/bob,
you'll add it to the ring as "bob". Thus, replicators on other nodes
were copying things onto c0d5p1, but there was only one replicator
draining data from it, so it filled up.
For future reference, if device names change, try to preserve the old
mount points. That'll stop Swift from becoming confused.
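As a rough sketch (the label, device path, and port below are illustrative, not taken from your cluster), mounting by filesystem label keeps the mount point stable even if the kernel renumbers the devices, and the ring only ever refers to the mount-point name:

    # label the filesystem once, e.g. for XFS:
    #   xfs_admin -L swift-d3 /dev/cciss/c0d3p1
    # /etc/fstab: mount by label so a renumbered device keeps its old mount point
    LABEL=swift-d3  /srv/node/c0d3p1  xfs  noatime,nodiratime  0 0

    # the ring entry uses the mount-point name under /srv/node, not the kernel device
    swift-ring-builder object.builder add z1-192.168.1.3:6000/c0d3p1 100.0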
> So I tried to fix this problem by unmounting c0d5p1 and c0d4p1 on
> these two servers and removing them from the ring configuration. But
> things are still the same. Here's the output I get when I run
> swift-ring-builder to remove and rebalance account/object/container.builder:
>
> root at proxy1:/etc/swift# swift-ring-builder account.builder rebalance
> Reassigned 891289 (85.00%) partitions. Balance is now 33.33.
> -------------------------------------------------------------------------------
> NOTE: Balance of 33.33 indicates you should push this
> ring, wait at least 1 hours, and rebalance/repush.
> -------------------------------------------------------------------------------
> root at proxy1:/etc/swift# swift-ring-builder account.builder
> account.builder, build version 33
> 1048576 partitions, 3 replicas, 3 zones, 12 devices, 33.33 balance
> The minimum number of hours before a partition can be reassigned is 1
> Devices:  id  zone   ip address    port   name    weight  partitions  balance  meta
>            0     1   192.168.1.3   6002   c0d1p1  100.00      349525    33.33
>            1     1   192.168.1.3   6002   c0d2p1  100.00      349525    33.33
>            2     1   192.168.1.3   6002   c0d3p1  100.00      349526    33.33
>            3     2   192.168.1.4   6002   c0d1p1  100.00      262144     0.00
>            4     2   192.168.1.4   6002   c0d2p1  100.00      262144     0.00
>            5     2   192.168.1.4   6002   c0d3p1  100.00      262144     0.00
>            6     3   192.168.1.5   6002   c0d1p1  100.00      209715   -20.00
>            7     3   192.168.1.5   6002   c0d2p1  100.00      209716   -20.00
>            8     3   192.168.1.5   6002   c0d3p1  100.00      209715   -20.00
>           10     2   192.168.1.4   6002   c0d5p1  100.00      262144     0.00
>           11     3   192.168.1.5   6002   c0d5p1  100.00      209715   -20.00
>           14     3   192.168.1.5   6002   c0d6p1  100.00      209715   -20.00
>
> So I tried it once again 15 minutes ago and here's what I get:
>
> root at proxy1:/etc/swift# swift-ring-builder account.builder rebalance
> Cowardly refusing to save rebalance as it did not change at least 1%.
You've got a 3-zone, 3-replica layout. Swift's placement algorithm will
put one replica in each zone regardless of device weights (thus
maximizing durability at the expense of balanced utilization). This ring
balance, though somewhat uneven, is the best Swift can do under the
circumstances, and that's what swift-ring-builder is telling you.
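Roughly, from the numbers in your builder output:

    ideal share = 1048576 partitions * 3 replicas / 12 devices = 262144 per device
    zone 1 (3 devices) holds one full replica: 1048576 / 3 = ~349525  ->  +33.33% balance
    zone 2 (4 devices):                        1048576 / 4 =  262144  ->    0.00%
    zone 3 (5 devices):                        1048576 / 5 = ~209715  ->  -20.00%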
Ultimately, you lost some disks in 2 of your 3 zones, so 1 replica of
all your data should still be present in the cluster. I'd guess that the
reason for your loss of availability is that the rebalance moved the one
remaining replica of some of your objects, so the proxy server can't
find any replicas to service requests. After a complete replication pass
(assuming no further failures), your objects should return to full
availability.
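If you want to keep an eye on that replication pass, something along these lines should work, assuming the recon middleware is enabled on your object servers and Swift is logging to syslog (the defaults, but adjust to your setup):

    # replication stats (last pass time and duration) gathered from the object servers
    swift-recon -r

    # or watch a storage node's replicator log directly
    tail -f /var/log/syslog | grep object-replicator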