<div dir="ltr">Hi folks,<div><br></div><div>This is Emre from <a href="http://GROU.PS">GROU.PS</a> -- we operate an OpenStack Swift cluster since 2011, it's been great.</div><div><br></div><div>We have a standard installation with a single proxy server (proxy1) and three storage servers (storage1, storage2, storage3) each with 5x1TB disks.</div>
Following a chain of mistakes initiated by our hosting provider, who replaced the wrong disk on one of our OpenStack Swift storage servers, we ended up with the following situation:
<p class="">root@proxy1:/etc/swift# swift-ring-builder container.builder <br>container.builder, build version 41<br>1048576 partitions, 3 replicas, 3 zones, 14 devices, 60.00 balance<br>The minimum number of hours before a partition can be reassigned is 1<br>
Devices: id zone ip address port name weight partitions balance meta<br> 0 1 192.168.1.3 6001 c0d1p1 80.00 262144 37.50 <br> 1 1 192.168.1.3 6001 c0d2p1 80.00 262144 37.50 <br>
2 1 192.168.1.3 6001 c0d3p1 80.00 262144 37.50 <br> 3 2 192.168.1.4 6001 c0d1p1 100.00 238312 -0.00 <br> 4 2 192.168.1.4 6001 c0d2p1 100.00 238312 -0.00 <br>
5 2 192.168.1.4 6001 c0d3p1 100.00 238312 -0.00 <br> 6 3 192.168.1.5 6001 c0d1p1 100.00 209715 -12.00 <br> 7 3 192.168.1.5 6001 c0d2p1 100.00 209715 -12.00 <br>
8 3 192.168.1.5 6001 c0d3p1 100.00 209715 -12.00 <br> 10 2 192.168.1.4 6001 c0d5p1 100.00 238312 -0.00 <br> 11 3 192.168.1.5 6001 c0d5p1 100.00 209716 -12.00 <br>
14 3 192.168.1.5 6001 c0d6p1 100.00 209715 -12.00 <br> 15 1 192.168.1.3 6001 c0d5p1 80.00 262144 37.50 <br> 16 2 192.168.1.4 6001 c0d6p1 100.00 95328 -60.00</p>
<p class=""><br></p><p class="">root@proxy1:/etc/swift# ssh storage1 df -h<br>Filesystem Size Used Avail Use% Mounted on<br>/dev/cciss/c0d0p5 1.8T 38G 1.7T 3% /<br>none 3.9G 220K 3.9G 1% /dev<br>
none 4.0G 0 4.0G 0% /dev/shm<br>none 4.0G 60K 4.0G 1% /var/run<br>none 4.0G 0 4.0G 0% /var/lock<br>none 4.0G 0 4.0G 0% /lib/init/rw<br>
/dev/cciss/c0d1p1 1.9T 1.9T 239M 100% /srv/node/c0d1p1<br>/dev/cciss/c0d2p1 1.9T 1.9T 210M 100% /srv/node/c0d2p1<br>/dev/cciss/c0d3p1 1.9T 1.9T 104K 100% /srv/node/c0d3p1<br>/dev/cciss/c0d5p1 1.9T 1.2T 643G 66% /srv/node/c0d5p1<br>
/dev/cciss/c0d0p2 92M 51M 37M 59% /boot<br>/dev/cciss/c0d0p3 1.9G 35M 1.8G 2% /tmp</p><p class=""><br></p><p class="">root@proxy1:/etc/swift# ssh storage2 df -h<br>Filesystem Size Used Avail Use% Mounted on<br>
/dev/cciss/c0d0p5 1.8T 33G 1.7T 2% /<br>none 3.9G 220K 3.9G 1% /dev<br>none 4.0G 0 4.0G 0% /dev/shm<br>none 4.0G 108K 4.0G 1% /var/run<br>none 4.0G 0 4.0G 0% /var/lock<br>
none 4.0G 0 4.0G 0% /lib/init/rw<br>/dev/cciss/c0d0p3 1.9G 35M 1.8G 2% /tmp<br>/dev/cciss/c0d0p2 92M 51M 37M 59% /boot<br>/dev/cciss/c0d1p1 1.9T 1.5T 375G 80% /srv/node/c0d1p1<br>
/dev/cciss/c0d2p1 1.9T 1.5T 385G 80% /srv/node/c0d2p1<br>/dev/cciss/c0d3p1 1.9T 1.5T 382G 80% /srv/node/c0d3p1<br>/dev/cciss/c0d4p1 1.9T 1.5T 377G 80% /srv/node/c0d5p1<br>/dev/cciss/c0d5p1 1.9T 519G 1.4T 28% /srv/node/c0d6p1</p>
<p class=""><br></p><p class="">root@proxy1:/etc/swift# ssh storage3 df -h<br>Filesystem Size Used Avail Use% Mounted on<br>/dev/cciss/c0d0p5 1.8T 90G 1.7T 6% /<br>none 3.9G 224K 3.9G 1% /dev<br>
none 4.0G 0 4.0G 0% /dev/shm<br>none 4.0G 112K 4.0G 1% /var/run<br>none 4.0G 0 4.0G 0% /var/lock<br>none 4.0G 0 4.0G 0% /lib/init/rw<br>
/dev/cciss/c0d1p1 1.9T 1.1T 741G 61% /srv/node/c0d1p1<br>/dev/cciss/c0d2p1 1.9T 1.1T 741G 61% /srv/node/c0d2p1<br>/dev/cciss/c0d3p1 1.9T 1.1T 758G 60% /srv/node/c0d3p1<br>/dev/cciss/c0d5p1 1.9T 1.1T 765G 59% /srv/node/c0d5p1<br>
/dev/cciss/c0d6p1 1.9T 1.1T 772G 59% /srv/node/c0d6p1<br>/dev/cciss/c0d0p2 92M 51M 37M 59% /boot<br>/dev/cciss/c0d0p3 1.9G 35M 1.8G 2% /tmp</p><div><br></div><p class="">As you can see:</p><p class="">
* The balances are messed up and never return to a normal state, no matter how long we wait, although the behavior for end users is still stable.

* We tried erasing the contents of a disk on storage1 (/dev/cciss/c0d5p1) that had hit 100% full before all the others (the others were still at 95%), and this disk filled up again pretty quickly, while the others quickly caught up to 100%. We were expecting each disk to settle at roughly the same level, because the disks on storage2 and storage3 (5 disks each) are weighted at 100, whereas the disks on storage1 (only 4 disks) are weighted at 80 (see the quick check after this list).
<p class="">* There was a failing disk with storage2, so we replaced that (/dev/cciss/c0d5p1) but it is not filling up as quickly. Storage2 is almost 80%</p><p class="">* Storage3 is healthy.</p><p class="">* Storage1 is currently taken offline. Because it's been failing constantly and its disk space doesn't balance.</p>
<p class=""><br></p><p class="">What is the best course of action to take in this scenario. I believe, we can either:</p><p class="">1) Completely dump storage1. Remove zone-1 from the proxy. Get a new server with similar setup and add it as a new zone on proxy accordingly. </p>
<p class="">2) Stop storage1. Erase the contents of full disks on storage1. Switch to proxy, remove the full disks from the cluster, then add them as new devices. (again, a delay may </p><p class="">3) Or something completely different?</p>
<p class=""><br></p><p class="">My fear is, with both the first and second alternatives, if there's a delay between removing the zones or disks, and adding new ones, the other zones/disks would fill up. Therefore I would need to choose the alternative where there would be the minimal amount of delay.</p>
<p class="">Last but not least, please note that this is swift installation is outdated, never been updated since installation. (I am to blame!)</p><p class="">Thanks for your suggestions, directions in advance.</p><p class="">
Cheers,

-- 
Emre
<font face="yw-aa4fb83139ac691cb05d5cfd95da4950bbf70a7d-b525b4f57053e9faff67a45fe23fc468--ol" style></font></div></div>