[Swift] Rebalancing EC question

Clay Gerrard clay.gerrard at gmail.com
Thu Sep 16 18:02:23 UTC 2021


Not scary!

Because you have a 15/4 EC policy, we say each partition has 19
"replicas" (15 data + 4 parity fragments).  And since a rebalance will
move at most one "replica" of any given partition, up to 100% of your
partitions may have at least one replica assignment moved.

That means that after you push out this ring, 100% of your object GET
requests will find at most one "replica" out of place.  But that's OK!
In a 15/4 policy you only need 15 EC fragments to respond successfully,
and you still have 18 fragments that did NOT get reassigned.
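
To make that concrete, here's a quick back-of-the-envelope sketch (plain
Python, nothing Swift-specific; the numbers just mirror your 15/4 policy):

    # 15/4 EC policy: 15 data fragments + 4 parity fragments per object
    ndata, nparity = 15, 4
    total_fragments = ndata + nparity          # 19 "replicas" per partition

    moved_per_rebalance = 1                    # at most one replica of any partition moves
    still_in_place = total_fragments - moved_per_rebalance   # 18 fragments untouched

    # A GET only needs ndata fragments to reconstruct the object
    assert still_in_place >= ndata             # 18 >= 15, so reads keep working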

It's unfortunate the language is a little ambiguous, but it is talking
about the % of *partitions* that had a replica moved.  Since each object
resides in a single partition, the % of partitions affected most directly
communicates the % of client objects affected by the rebalance.  We do
NOT display the % of *partition-replicas* moved because, while that
number would be smaller, it could never reach 100% given the restriction
that only one "replica" of a partition may move per rebalance.

When doing a large topology change - particularly with EC - it may be
the case that more than one replica of each partition will need to move
(imagine doubling your capacity into a second zone on an 8+4 ring), so
it'll take a few cranks.  Eventually you'll want to have moved 6
replicas of each partition (leaving 6 in z1 and placing 6 in z2), but if
we allowed you to move six replicas of 100% of your partitions at once
you'd have only 6 of the 8 fragments required to service reads!
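
Here's a minimal sketch of that arithmetic, assuming an 8+4 ring being
doubled into a second zone:

    # 8+4 EC policy, capacity doubled into a second zone
    ndata, nparity = 8, 4
    total = ndata + nparity                   # 12 fragments per partition

    target_moves = total // 2                 # 6 fragments of each part end up in z2
    moves_per_crank = 1                       # only one replica of a part per rebalance

    cranks_needed = target_moves // moves_per_crank   # at least 6 rebalance/push cycles

    # If all 6 moved at once, only 6 fragments would still be in place,
    # which is fewer than the 8 needed to service a read.
    assert total - target_moves < ndata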

Protip: when you push out the new ring you can turn on handoffs_only mode
for the reconstructor for a little while to get things rebalanced MUCH more
quickly - just don't forget to turn it off!
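
For reference, that's the object-reconstructor's handoffs_only option; a
sketch of what the temporary change might look like, assuming the stock
/etc/swift/object-server.conf layout (check the docs for your Swift
release before relying on it):

    [object-reconstructor]
    # TEMPORARY: only revert fragments sitting on handoff nodes so the
    # rebalance settles faster; set this back to false (and reload the
    # reconstructor) once the ring has finished moving.
    handoffs_only = true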

(sending a second time because I forgot to reply-all to the list)

On Thu, Sep 16, 2021 at 11:35 AM Reid Guyett <rguyett at datto.com> wrote:

> Hello,
>
> We were working on expanding one of our clusters (Ussuri on Ubuntu
> 18.04) and are wondering about the rebalance behavior of
> swift-ring-builder. When we run it in debug mode on a 15/4 EC ring, we
> see this message about "Unable to finish rebalance plan after 2
> attempts" and are seeing 100% partitions reassigned.
>
> DEBUG: Placed 10899/2 onto dev r1z3-10.40.48.72/d10
> DEBUG: Placed 2183/3 onto dev r1z5-10.40.48.76/d11
> DEBUG: Placed 1607/1 onto dev r1z3-10.40.48.70/d28
> DEBUG: Assigned 32768 parts
> DEBUG: Gather start is 10278 (Last start was 25464)
> DEBUG: Unable to finish rebalance plan after 2 attempts
> Reassigned 32768 (100.00%) partitions. Balance is now 63.21.
> Dispersion is now 0.00
>
> -------------------------------------------------------------------------------
> NOTE: Balance of 63.21 indicates you should push this
>       ring, wait at least 1 hours, and rebalance/repush.
>
> -------------------------------------------------------------------------------
>
> Moving 100% seems scary, what does that mean in this situation? Is
> this message because 1 fragment from every partition is moved and that
> is the most that it can do per rebalance because they are technically
> the same partition?
> When we compare the swift-ring-builder output (partitions per device)
> between rebalances we can see some partitions move each time until we
> no longer see the push/wait/rebalance message again. So it's not
> really moving 100% of partitions.
>
> Reid
>
>
>

-- 
Clay Gerrard