Folks,<div><br></div><div>This is the third day, and I see little or no change (a few KB at most) in usage on the new disks.</div><div><br></div><div>Is this normal? Is there a long computation phase that has to finish before the newly added disks actually start filling?</div>

<div><br></div><div>Or should I start from scratch with the "create" command this time? Last time, I did not run "swift-ring-builder create 20 3 1 .." first; I went straight to "swift-ring-builder add ..." and used the existing ring.gz files, because I thought that otherwise I might be reformatting the whole stack. I'm not sure whether that's actually the case.</div>

<div><font color="#222222" face="arial, sans-serif"><br></font></div><div><font color="#222222" face="arial, sans-serif">Please advise. Thanks,</font></div><div><font color="#222222" face="arial, sans-serif"><br></font></div>

<div><font color="#222222" face="arial, sans-serif">--</font></div><div><font color="#222222" face="arial, sans-serif">Emre<br></font><br><div class="gmail_quote">On Mon, Oct 22, 2012 at 12:09 PM, Emre Sokullu <span dir="ltr"><<a href="mailto:emre@groups-inc.com" target="_blank">emre@groups-inc.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Samuel,<br>

<br>

Thanks for the quick reply.<br>

<br>

The weights are all 100. Here's the output of swift-ring-builder:<br>

<br>

root@proxy1:/etc/swift# swift-ring-builder account.builder<br>

account.builder, build version 13<br>

1048576 partitions, 3 replicas, 3 zones, 12 devices, 0.00 balance<br>

The minimum number of hours before a partition can be reassigned is 1<br>

Devices:    id  zone      ip address  port      name weight partitions balance meta<br>

             0     1     192.168.1.3  6002    c0d1p1 100.00     262144    0.00<br>

             1     1     192.168.1.3  6002    c0d2p1 100.00     262144    0.00<br>

             2     1     192.168.1.3  6002    c0d3p1 100.00     262144    0.00<br>

             3     2     192.168.1.4  6002    c0d1p1 100.00     262144    0.00<br>

             4     2     192.168.1.4  6002    c0d2p1 100.00     262144    0.00<br>

             5     2     192.168.1.4  6002    c0d3p1 100.00     262144    0.00<br>

             6     3     192.168.1.5  6002    c0d1p1 100.00     262144    0.00<br>

             7     3     192.168.1.5  6002    c0d2p1 100.00     262144    0.00<br>

             8     3     192.168.1.5  6002    c0d3p1 100.00     262144    0.00<br>

             9     1     192.168.1.3  6002    c0d4p1 100.00     262144    0.00<br>

            10     2     192.168.1.4  6002    c0d4p1 100.00     262144    0.00<br>

            11     3     192.168.1.5  6002    c0d4p1 100.00     262144    0.00<br>
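<br>
As a rough sanity check on the output above, the ideal share of each device can be estimated by splitting (partitions x replicas) in proportion to device weight. This is only a sketch of that proportionality, not swift-ring-builder's actual placement algorithm; the function name expected_partitions and the weight of 1000 in the second example are hypothetical.<br>

```python
def expected_partitions(part_power, replicas, weights):
    """Ideal partition-replica count per device, proportional to weight.

    A simplification: the real builder also considers zones and
    min_part_hours, which this sketch ignores.
    """
    total_slots = (2 ** part_power) * replicas
    total_weight = sum(weights.values())
    return {dev: total_slots * w / total_weight for dev, w in weights.items()}


# Twelve devices at weight 100, part power 20, 3 replicas (as in the output above).
equal = expected_partitions(20, 3, {dev_id: 100.0 for dev_id in range(12)})
print(equal[0])  # 262144.0

# Hypothetical case: devices 0-8 at weight 1000, new devices 9-11 at weight 100.
skewed_weights = {dev_id: 1000.0 for dev_id in range(9)}
skewed_weights.update({dev_id: 100.0 for dev_id in range(9, 12)})
skewed = expected_partitions(20, 3, skewed_weights)
print(round(skewed[0]), round(skewed[9]))
```

With twelve devices at weight 100, the sketch gives 262144 per device, which matches the partitions column above. In the hypothetical skewed case, the new devices would receive a much smaller share than the old ones.<br>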

<div class="HOEnZb"><div class="h5"><br>

On Mon, Oct 22, 2012 at 12:03 PM, Samuel Merritt <<a href="mailto:sam@swiftstack.com">sam@swiftstack.com</a>> wrote:<br>

> On 10/22/12 9:38 AM, Emre Sokullu wrote:<br>

>><br>

>> Hi folks,<br>

>><br>

>> At <a href="http://GROU.PS" target="_blank">GROU.PS</a>, we've been an OpenStack SWIFT user for more than 1.5 years<br>

>> now. Currently, we hold about 18TB of data on 3 storage nodes. Since<br>

>> we hit 84% in utilization, we have recently decided to expand the<br>

>> storage with more disks.<br>

>><br>

>> In order to do that, after creating a new c0d4p1 partition in each of<br>

>> the storage nodes, we ran the following commands on our proxy server:<br>

>><br>

>> swift-ring-builder account.builder add z1-192.168.1.3:6002/c0d4p1 100<br>

>> swift-ring-builder container.builder add z1-192.168.1.3:6002/c0d4p1 100<br>

>> swift-ring-builder object.builder add z1-192.168.1.3:6002/c0d4p1 100<br>

>> swift-ring-builder account.builder add z2-192.168.1.4:6002/c0d4p1 100<br>

>> swift-ring-builder container.builder add z2-192.168.1.4:6002/c0d4p1 100<br>

>> swift-ring-builder object.builder add z2-192.168.1.4:6002/c0d4p1 100<br>

>> swift-ring-builder account.builder add z3-192.168.1.5:6002/c0d4p1 100<br>

>> swift-ring-builder container.builder add z3-192.168.1.5:6002/c0d4p1 100<br>

>> swift-ring-builder object.builder add z3-192.168.1.5:6002/c0d4p1 100<br>

>><br>

>> [snip]<br>

><br>

>><br>

>> So right now, the problem is that the disk growth in each of the storage<br>

>> nodes seems to have stalled,<br>

><br>

> So you've added 3 new devices to each ring and assigned a weight of 100 to<br>

> each one. What are the weights of the other devices in the ring? If they're<br>

> much larger than 100, then that will cause the new devices to end up with a<br>

> small fraction of the data you want on them.<br>

><br>

> Running "swift-ring-builder <thing>.builder" will show you information,<br>

> including weights, of all the devices in the ring.<br>

><br>

><br>

><br>

>> * Bonus question: why do we copy ring.gz files to storage nodes, and<br>

>> how critical are they? It's not clear to me how Swift can afford to<br>

>> wait (even if it's just a few seconds) for the .ring.gz files to reach<br>

>> the storage nodes after rebalancing, if those files are so critical.<br>

><br>

><br>

> The ring.gz files contain the mapping from Swift partitions to disks. As you<br>

> know, the proxy server uses it to determine which backends have the data for<br>

> a given request. The replicators also use the ring to determine where data<br>

> belongs so that they can ensure the right number of replicas, etc.<br>

><br>

> When two storage nodes have different versions of a ring.gz file, you can<br>

> get replicator fights. They look like this:<br>

><br>

> - node1's (old) ring says that the partition for a replica of /cof/fee/cup<br>

> belongs on node2's /dev/sdf.<br>

> - node2's (new) ring says that the same partition belongs on node1's<br>

> /dev/sdd.<br>

><br>

> When the replicator on node1 runs, it will see that it has the partition for<br>

> /cof/fee/cup on its disk. It will then consult the ring, push that<br>

> partition's contents to node2, and then delete its local copy (since node1's<br>

> ring says that this data does not belong on node1).<br>

><br>

> When the replicator on node2 runs, it will do the converse: push to node1,<br>

> then delete its local copy.<br>

><br>

> If you leave the rings out of sync for a long time, then you'll end up<br>

> consuming disk and network IO ping-ponging a set of data around. If they're<br>

> out of sync for a few seconds, then it's not a big deal.<br>

><br>

> _______________________________________________<br>

> Mailing list: <a href="https://launchpad.net/~openstack" target="_blank">https://launchpad.net/~openstack</a><br>

> Post to     : <a href="mailto:openstack@lists.launchpad.net">openstack@lists.launchpad.net</a><br>

> Unsubscribe : <a href="https://launchpad.net/~openstack" target="_blank">https://launchpad.net/~openstack</a><br>

> More help   : <a href="https://help.launchpad.net/ListHelp" target="_blank">https://help.launchpad.net/ListHelp</a><br>

</div></div></blockquote></div><br></div>
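The "replicator fight" described in Samuel's quoted reply can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not Swift's real replicator: each node keeps its own copy of a ring that maps one partition to one node, and a replicator pass pushes the partition to wherever the local ring says it belongs, then deletes the local copy. The names replicator_pass, simulate, and the partition label "p1" are hypothetical.<br>

```python
def replicator_pass(node, holdings, rings, partition):
    """Run one toy replicator pass on `node`; return 1 if data was pushed."""
    if partition not in holdings[node]:
        return 0
    target = rings[node][partition]  # where this node's ring says it belongs
    if target == node:
        return 0
    holdings[target].add(partition)    # push to the node the local ring names
    holdings[node].discard(partition)  # then delete the local copy
    return 1


def simulate(rings, start_node, partition, rounds):
    """Run `rounds` passes over all nodes; return (push count, final holdings)."""
    holdings = {node: set() for node in rings}
    holdings[start_node].add(partition)
    pushes = 0
    for _ in range(rounds):
        for node in sorted(rings):
            pushes += replicator_pass(node, holdings, rings, partition)
    return pushes, holdings


# Out of sync: node1's (old) ring points at node2, node2's (new) ring at node1.
mismatched = {"node1": {"p1": "node2"}, "node2": {"p1": "node1"}}
# In sync: both nodes agree that the partition lives on node2.
matched = {"node1": {"p1": "node2"}, "node2": {"p1": "node2"}}

pushes_bad, _ = simulate(mismatched, "node1", "p1", rounds=3)
pushes_ok, final_ok = simulate(matched, "node1", "p1", rounds=3)
print(pushes_bad, pushes_ok)  # 6 1
```

In this toy model the mismatched rings cause a push on every pass for as long as they stay out of sync, while matching rings settle after a single move. A brief mismatch therefore costs only a few extra transfers.<br>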