[openstack-dev] Objects not getting distributed across the swift cluster...
Shyam Prasad N
nspmangalore at gmail.com
Fri May 2 05:53:57 UTC 2014
Hi John,
Thanks for the explanation. I have a couple more questions on this subject,
though.
1. "pretend_min_part_hours_passed" sounds like something that I could use.
I'm okay with a possible interruption of service to users during that
window, as long as it does not cause any data loss or data corruption. (See
the sketch just below for how I'd expect to invoke it.)
2. It would be really useful if Swift could log the pending rebalance
operations somewhere and run them automatically later, once min_part_hours
has elapsed. (A rough wrapper along those lines also follows below.)
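If I do end up using it, I assume the invocation would look roughly like the
following, taking the object ring as an example (the container and account
rings would get the same treatment):

swift-ring-builder object.builder pretend_min_part_hours_passed
swift-ring-builder object.builder rebalance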
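In the meantime, I'm considering a small wrapper along these lines as a
stop-gap. This is only a sketch; the builder names, the /etc/swift paths,
and the node list are taken from my setup below and would need adjusting:

#!/bin/bash
# Re-run rebalance periodically (e.g. from cron), so that partitions that
# could not move earlier because of min_part_hours get picked up on a
# later pass.
for builder in object container account; do
    swift-ring-builder "/etc/swift/${builder}.builder" rebalance
done
# Push the regenerated ring files out to every node in the cluster.
for node in 10.3.0.202 10.3.0.212 10.3.0.222 10.3.0.232; do
    scp /etc/swift/*.ring.gz "${node}:/etc/swift/"
done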
Regards,
Shyam
On Thu, May 1, 2014 at 11:15 PM, John Dickinson <me at not.mn> wrote:
>
> On May 1, 2014, at 10:32 AM, Shyam Prasad N <nspmangalore at gmail.com>
> wrote:
>
> > Hi Chuck,
> > Thanks for the reply.
> >
> > The reason for such a weight distribution seems to be the ring rebalance
> > command. I've scripted the disk-addition (and rebalance) process for the
> > ring with a wrapper command. When I trigger the rebalance after each disk
> > addition, only the first rebalance seems to take effect.
> >
> > Is there any other way to adjust the weights other than rebalance? Or is
> > there a way to force a rebalance even when the rebalances (run as part of
> > disk addition) happen less than an hour apart (the min_part_hours value
> > set at ring creation)?
>
> Rebalancing only moves one replica at a time to ensure that your data
> remains available, even if you have a hardware failure while you are adding
> capacity. This is why it may take multiple rebalances to get everything
> evenly balanced.
>
> The min_part_hours setting (perhaps poorly named) should match how long a
> replication pass takes in your cluster. You can understand this because of
> what I said above. By ensuring that replication has completed before
> putting another partition "in flight", Swift can ensure that you keep your
> data highly available.
>
> For completeness to answer your question, there is an (intentionally)
> undocumented option in swift-ring-builder called
> "pretend_min_part_hours_passed", but it should ALMOST NEVER be used in a
> production cluster, unless you really, really know what you are doing.
> Using that option will very likely cause service interruptions to your
> users. The better option is to correctly set the min_part_hours value to
> match your replication pass time (with set_min_part_hours), and then wait
> for swift to move things around.
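If I follow, the recommended fix on my side would look something like the
following; the 3 here is only a placeholder for however many hours a full
replication pass actually takes on my cluster:

swift-ring-builder object.builder set_min_part_hours 3
swift-ring-builder object.builder rebalance

After that, the idea would be to let replication finish before triggering
the next rebalance.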
>
> Here's some more info on how and why to add capacity to a running Swift
> cluster: https://swiftstack.com/blog/2012/04/09/swift-capacity-management/
>
> --John
>
>
>
>
>
> > On May 1, 2014 9:00 PM, "Chuck Thier" <cthier at gmail.com> wrote:
> > Hi Shyam,
> >
> > If I am reading your ring output correctly, it looks like only the
> > devices on node .202 have a weight set, which is why all of your objects
> > are going to that one node. You can update the weight of the other
> > devices and rebalance, and things should get distributed correctly.
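For the record, I believe the per-device weight update Chuck means would be
something like the following, followed by another rebalance; the device id
d3 and the weight of 1.0 here are just examples:

swift-ring-builder object.builder set_weight d3 1.0
swift-ring-builder object.builder rebalance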
> >
> > --
> > Chuck
> >
> >
> > On Thu, May 1, 2014 at 5:28 AM, Shyam Prasad N <nspmangalore at gmail.com>
> wrote:
> > Hi,
> >
> > I created a swift cluster and configured the rings like this...
> >
> > swift-ring-builder object.builder create 10 3 1
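For context, in the create call above 10 is the partition power, 3 is the
replica count, and the trailing 1 is min_part_hours. My wrapper then added
each disk with a command along these lines before rebalancing (this
particular invocation is illustrative; it corresponds to device 0 in the
output below):

swift-ring-builder object.builder add r1z1-10.3.0.202:6010/xvdb 1.0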
> >
> > ubuntu-202:/etc/swift$ swift-ring-builder object.builder
> > object.builder, build version 12
> > 1024 partitions, 3.000000 replicas, 1 regions, 4 zones, 12 devices, 300.00 balance
> > The minimum number of hours before a partition can be reassigned is 1
> > Devices:  id region zone  ip address  port  replication ip  replication port  name  weight  partitions  balance  meta
> >            0      1    1  10.3.0.202  6010      10.3.0.202              6010  xvdb    1.00        1024   300.00
> >            1      1    1  10.3.0.202  6020      10.3.0.202              6020  xvdc    1.00        1024   300.00
> >            2      1    1  10.3.0.202  6030      10.3.0.202              6030  xvde    1.00        1024   300.00
> >            3      1    2  10.3.0.212  6010      10.3.0.212              6010  xvdb    1.00           0  -100.00
> >            4      1    2  10.3.0.212  6020      10.3.0.212              6020  xvdc    1.00           0  -100.00
> >            5      1    2  10.3.0.212  6030      10.3.0.212              6030  xvde    1.00           0  -100.00
> >            6      1    3  10.3.0.222  6010      10.3.0.222              6010  xvdb    1.00           0  -100.00
> >            7      1    3  10.3.0.222  6020      10.3.0.222              6020  xvdc    1.00           0  -100.00
> >            8      1    3  10.3.0.222  6030      10.3.0.222              6030  xvde    1.00           0  -100.00
> >            9      1    4  10.3.0.232  6010      10.3.0.232              6010  xvdb    1.00           0  -100.00
> >           10      1    4  10.3.0.232  6020      10.3.0.232              6020  xvdc    1.00           0  -100.00
> >           11      1    4  10.3.0.232  6030      10.3.0.232              6030  xvde    1.00           0  -100.00
> >
> > Container and account rings have a similar configuration.
> > Once the rings were created and all the disks were added to the rings as
> > above, I ran rebalance on each ring. (I ran rebalance after adding each
> > of the nodes above.)
> > Then I immediately scp'd the rings to all the other nodes in the cluster.
> >
> > I now observe that the objects are all going to 10.3.0.202. I don't see
> > the objects being replicated to the other nodes. So much so that 202 is
> > approaching 100% disk usage, while the other nodes are almost completely
> > empty.
> > What am I doing wrong? Am I not supposed to run a rebalance operation
> > after the addition of each disk/node?
> >
> > Thanks in advance for the help.
> >
> > --
> > -Shyam
> >
--
-Shyam