[openstack-dev] Objects not getting distributed across the swift cluster...

Shyam Prasad N nspmangalore at gmail.com
Fri May 2 05:53:57 UTC 2014


Hi John,

Thanks for the explanation. I have a couple more questions on this subject,
though.

1. "pretend_min_hours_passed" sounds like something that I could use. I'm
okay if there is a chance of interruption in services to the user at this
time, as long as it does not cause any data-loss or data-corruption.
2. It would have been really useful if the rebalancing operations could be
logged by swift somewhere and automatically run later (after
min_part_hours).
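
A rough sketch of that workaround, assuming the builders live in /etc/swift on
this node and the other storage nodes are reachable over ssh (host names and
paths are placeholders from my setup):

#!/bin/sh
# Hypothetical /etc/cron.hourly/swift-rebalance -- adjust paths and hosts.
cd /etc/swift || exit 1
for ring in object container account; do
    # Re-run the rebalance; if min_part_hours has not passed yet, the
    # builder simply will not move any more partitions.
    swift-ring-builder ${ring}.builder rebalance
    # Pushing an unchanged ring.gz is harmless, so copy unconditionally.
    for host in 10.3.0.212 10.3.0.222 10.3.0.232; do
        scp ${ring}.ring.gz ${host}:/etc/swift/
    done
done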

Regards,
Shyam


On Thu, May 1, 2014 at 11:15 PM, John Dickinson <me at not.mn> wrote:

>
> On May 1, 2014, at 10:32 AM, Shyam Prasad N <nspmangalore at gmail.com>
> wrote:
>
> > Hi Chuck,
> > Thanks for the reply.
> >
> > The reason for this distribution seems to be the ring rebalance command.
> > I've scripted the process of adding disks to the ring (and rebalancing)
> > with a wrapper command. When I trigger the rebalance after each disk
> > addition, only the first rebalance seems to take effect.
> >
> > Is there any way to adjust the weights other than a rebalance? Or is there
> > a way to force a rebalance even when the interval between rebalances (as
> > part of disk addition) is under an hour (the min_part_hours value set at
> > ring creation)?
>
> Rebalancing only moves one replica at a time to ensure that your data
> remains available, even if you have a hardware failure while you are adding
> capacity. This is why it may take multiple rebalances to get everything
> evenly balanced.
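> In practice (assuming your builder file is object.builder, as in your
> output), that just means re-running the rebalance once enough time has
> passed, roughly:
>
>   swift-ring-builder object.builder rebalance   # moves at most one replica of each partition
>   # ...wait at least min_part_hours and let replication finish, then:
>   swift-ring-builder object.builder rebalance   # repeat until the reported balance settles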
>
> The min_part_hours setting (perhaps poorly named) should match how long a
> replication pass takes in your cluster. This follows from what I said above:
> by ensuring that replication has completed before putting another partition
> "in flight", Swift keeps your data highly available.
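> To pick a value, you can check how long a replication pass actually takes
> (for example with swift-recon's replication stats, or from the
> object-replicator's "Object replication complete" log lines) and set
> min_part_hours to match; something along these lines, assuming the builder
> is object.builder:
>
>   swift-recon -r                                            # replication times reported by the object servers
>   swift-ring-builder object.builder set_min_part_hours 1    # e.g. passes of ~45 minutes -> 1 hour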
>
> For completeness, and to answer your question: there is an (intentionally)
> undocumented option in swift-ring-builder called
> "pretend_min_part_hours_passed", but it should ALMOST NEVER be used in a
> production cluster unless you really, really know what you are doing.
> Using that option will very likely cause service interruptions to your
> users. The better option is to correctly set the min_part_hours value to
> match your replication pass time (with set_min_part_hours), and then wait
> for swift to move things around.
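> (If you do end up needing that escape hatch on a test cluster, it is just
> another builder subcommand, roughly:
>
>   swift-ring-builder object.builder pretend_min_part_hours_passed
>   swift-ring-builder object.builder rebalance
>
> but on a production cluster the safe route really is set_min_part_hours
> plus patience.)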
>
> Here's some more info on how and why to add capacity to a running Swift
> cluster: https://swiftstack.com/blog/2012/04/09/swift-capacity-management/
>
> --John
>
>
>
>
>
> > On May 1, 2014 9:00 PM, "Chuck Thier" <cthier at gmail.com> wrote:
> > Hi Shyam,
> >
> > If I am reading your ring output correctly, it looks like only the devices
> > on node .202 have a weight set, which is why all of your objects are going
> > to that one node. You can update the weights of the other devices and
> > rebalance, and things should get distributed correctly.
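> > For example, assuming the builder is object.builder and using the device
> > ids from your output, something along these lines should do it:
> >
> >   swift-ring-builder object.builder set_weight d3 1.0   # repeat for each unweighted device, d4..d11
> >   swift-ring-builder object.builder rebalance
> >
> > (d3 is the builder's search value for device id 3.)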
> >
> > --
> > Chuck
> >
> >
> > On Thu, May 1, 2014 at 5:28 AM, Shyam Prasad N <nspmangalore at gmail.com>
> wrote:
> > Hi,
> >
> > I created a swift cluster and configured the rings like this...
> >
> > swift-ring-builder object.builder create 10 3 1
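> > (Each disk was then added by my wrapper script with something along these
> > lines before the rebalance; the exact command in the script may differ:
> >
> >   swift-ring-builder object.builder add r1z2-10.3.0.212:6010/xvdb 1.00
> >
> > with the region/zone/IP/port/device taken from the table below.)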
> >
> > ubuntu-202:/etc/swift$ swift-ring-builder object.builder
> > object.builder, build version 12
> > 1024 partitions, 3.000000 replicas, 1 regions, 4 zones, 12 devices, 300.00 balance
> > The minimum number of hours before a partition can be reassigned is 1
> > Devices:  id  region  zone  ip address   port  replication ip  replication port  name  weight  partitions  balance  meta
> >            0       1     1  10.3.0.202   6010  10.3.0.202                  6010  xvdb    1.00        1024   300.00
> >            1       1     1  10.3.0.202   6020  10.3.0.202                  6020  xvdc    1.00        1024   300.00
> >            2       1     1  10.3.0.202   6030  10.3.0.202                  6030  xvde    1.00        1024   300.00
> >            3       1     2  10.3.0.212   6010  10.3.0.212                  6010  xvdb    1.00           0  -100.00
> >            4       1     2  10.3.0.212   6020  10.3.0.212                  6020  xvdc    1.00           0  -100.00
> >            5       1     2  10.3.0.212   6030  10.3.0.212                  6030  xvde    1.00           0  -100.00
> >            6       1     3  10.3.0.222   6010  10.3.0.222                  6010  xvdb    1.00           0  -100.00
> >            7       1     3  10.3.0.222   6020  10.3.0.222                  6020  xvdc    1.00           0  -100.00
> >            8       1     3  10.3.0.222   6030  10.3.0.222                  6030  xvde    1.00           0  -100.00
> >            9       1     4  10.3.0.232   6010  10.3.0.232                  6010  xvdb    1.00           0  -100.00
> >           10       1     4  10.3.0.232   6020  10.3.0.232                  6020  xvdc    1.00           0  -100.00
> >           11       1     4  10.3.0.232   6030  10.3.0.232                  6030  xvde    1.00           0  -100.00
> >
> > Container and account rings have a similar configuration.
> > Once the rings were created and all the disks were added to the rings as
> > shown above, I ran a rebalance on each ring. (I ran a rebalance after
> > adding each of the nodes above.)
> > Then I immediately scp'd the rings to all the other nodes in the cluster.
> >
> > I now observe that the objects are all going to 10.3.0.202; I don't see
> > the objects being replicated to the other nodes. So much so that .202 is
> > approaching 100% disk usage while the other nodes are almost completely
> > empty.
> > What am I doing wrong? Am I not supposed to run a rebalance after adding
> > each disk/node?
> >
> > Thanks in advance for the help.
> >
> > --
> > -Shyam
> >
>
>


-- 
-Shyam