<div dir="ltr"><div dir="ltr">Not scary!<div><br><div>Because you have a 15/4 EC policy, we say each partition has 19 "replicas".  And since a rebalance will move at most one "replica" of any partition, up to 100% of your partitions may have one replica assignment move.</div><div><br></div><div>That means, after you push out this ring, 100% of your object GET requests will find at most one "replica" out of place.  But that's ok!  In a 15/4 policy you only need 15 EC fragments to respond successfully, and you have 18 fragments that did NOT get reassigned.</div><div><br></div><div>It's unfortunate the language is a little ambiguous, but it is talking about the % of *partitions* that had a replica moved.  Since each object resides in a single partition, the % of partitions affected most directly communicates the % of client objects affected by the rebalance.  We do NOT display the % of *partition-replicas* moved because, while the number would be smaller, it could never be 100% because of the restriction that only one "replica" of a partition may move.</div><div><br></div><div>When doing a large topology change - particularly with EC - it may be the case that more than one replica of each part will need to move (imagine doubling your capacity into a second zone on an 8+4 ring) - so it'll take a few cranks of push/wait/rebalance.  
Eventually you'll want to have moved 6 replicas of each part (ending up with 6 in z1 and 6 in z2), but if we allowed you to move six replicas of 100% of your parts at once you'd only have 6 of the 8 fragments required to service reads!</div></div><div><br></div><div>Protip: when you push out the new ring you can turn on handoffs_only mode for the reconstructor for a little while to get things rebalanced MUCH more quickly - just don't forget to turn it off!</div><div><br></div><div>(sending second time because I forgot to reply all to the list)</div><div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Sep 16, 2021 at 11:35 AM Reid Guyett &lt;<a href="mailto:rguyett@datto.com">rguyett@datto.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello,<br>

<br>

We were working on expanding one of our clusters (Ussuri on Ubuntu<br>

18.04) and are wondering about the rebalance behavior of<br>

swift-ring-builder. When we run it in debug mode on a 15/4 EC ring, we<br>

see this message about "Unable to finish rebalance plan after 2<br>

attempts" and are seeing 100% partitions reassigned.<br>

<br>

DEBUG: Placed 10899/2 onto dev r1z3-10.40.48.72/d10<br>

DEBUG: Placed 2183/3 onto dev r1z5-10.40.48.76/d11<br>

DEBUG: Placed 1607/1 onto dev r1z3-10.40.48.70/d28<br>

DEBUG: Assigned 32768 parts<br>

DEBUG: Gather start is 10278 (Last start was 25464)<br>

DEBUG: Unable to finish rebalance plan after 2 attempts<br>

Reassigned 32768 (100.00%) partitions. Balance is now 63.21.<br>

Dispersion is now 0.00<br>

-------------------------------------------------------------------------------<br>

NOTE: Balance of 63.21 indicates you should push this<br>

      ring, wait at least 1 hours, and rebalance/repush.<br>

-------------------------------------------------------------------------------<br>

<br>

Moving 100% seems scary, what does that mean in this situation? Is<br>

this message because 1 fragment from every partition is moved and that<br>

is the most that it can do per rebalance because they are technically<br>

the same partition?<br>

When we compare the swift-ring-builder output (partitions per device)<br>

between rebalances we can see some partitions move each time until we<br>

no longer see the push/wait/rebalance message again. So it's not<br>

really moving 100% partitions.<br>

<br>

Reid<br>

<br>

<br>

</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr">Clay Gerrard</div></div></div>
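<div dir="ltr"><div>To make the fragment math above concrete, here is a rough sketch in plain Python. It is not Swift code: the function name and its arguments are made up for illustration, and it only counts fragments still sitting in their assigned (primary) locations, ignoring anything a handoff might serve.</div>
<pre>
```python
def fragments_in_place(ndata, nparity, moved):
    """Return (fragments still in place, whether reads can be served from them).

    Hypothetical helper for illustration only; not part of Swift.
    """
    total = ndata + nparity        # "replicas" per partition in the ring
    remaining = total - moved      # fragments that were NOT reassigned
    # An EC read needs at least ndata fragments to decode the object.
    return remaining, remaining >= ndata


# 15/4 policy, one "replica" moved per rebalance
print(fragments_in_place(15, 4, 1))
# 8+4 policy, six "replicas" moved in a single rebalance
print(fragments_in_place(8, 4, 6))
```
</pre>
<div>The first call reports 18 fragments in place, which is enough for a 15/4 read; the second reports only 6, short of the 8 required, which is why the builder limits each rebalance to one "replica" per partition.</div></div>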