[Openstack] Bad perf on swift servers...

Kuo Hugo tonytkdk at gmail.com
Fri May 30 02:19:10 UTC 2014


Hi,

1. Correct! Once you add new devices and rebalance the rings, a portion of
the partitions will be reassigned to the new devices. If those partitions
hold objects, the object-replicator will move that data to the new devices.
You should see object-replicator log entries showing it transferring objects
from one device to another by invoking rsync.
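
If you want to confirm how the rebalance redistributed things, a rough
Python sketch like the one below (the /etc/swift/object.ring.gz path is an
assumption; adjust it to your install) counts the partitions currently
assigned to each device:

    from collections import Counter
    from swift.common.ring import Ring

    # Path is an assumption; point it at your actual object ring
    ring = Ring('/etc/swift/object.ring.gz')

    per_device = Counter()
    for part in range(ring.partition_count):
        for node in ring.get_part_nodes(part):
            per_device[(node['ip'], node['device'])] += 1

    for (ip, dev), count in sorted(per_device.items()):
        print('%s/%s holds %d partitions' % (ip, dev, count))

After the rebalance settles, the new node's devices should hold roughly
their weighted share of partitions, and the rsync/object-replicator load
should taper off once those partitions have been copied.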

2. Regarding the busy swift-account-server, that's pretty abnormal. Is
there anything in the logs indicating what the account-server is doing? One
possibility is that a ring contains the wrong port number, so requests from
other workers end up at the account-server. Perhaps you can paste the
layout of all your rings to http://paste.openstack.org/ . Running strace on
the account-server process may also help track down what it is doing.
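
For example, a quick sketch to dump every ring's device layout (assuming
the rings live under /etc/swift/ and the default 6002/6001/6000 ports for
the account/container/object servers):

    from swift.common.ring import Ring

    for name in ('account', 'container', 'object'):
        ring = Ring('/etc/swift/%s.ring.gz' % name)
        for dev in ring.devs:
            if dev is None:  # removed devices leave holes in the list
                continue
            print('%-9s %s:%s %s (weight %s)'
                  % (name, dev['ip'], dev['port'], dev['device'], dev['weight']))

If, say, the container or object ring points any device at the
account-server's port, requests from those workers would land on the
account-server and could explain the load.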

3. In a deployment where the outward-facing interface shares the same
network resource with the cluster-facing interface, there will definitely be
contention for bandwidth. Hence the frontend traffic is now being impacted
by the replication traffic.
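
If your Swift version supports a dedicated replication network (roughly 1.8
and later), it is worth checking whether the ring devices carry a separate
replication_ip/replication_port, or whether replication simply falls back to
the client-facing interface. A minimal sketch (path assumed):

    from swift.common.ring import Ring

    ring = Ring('/etc/swift/object.ring.gz')  # path is an assumption
    for dev in ring.devs:
        if dev is None:
            continue
        # Rings built before replication-network support may lack these keys
        rep_ip = dev.get('replication_ip', dev['ip'])
        rep_port = dev.get('replication_port', dev['port'])
        print('%s:%s replicates via %s:%s'
              % (dev['ip'], dev['port'], rep_ip, rep_port))

If the replication addresses are the same as the frontend ones, all of the
rsync traffic rides the same interface as client requests.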

4. A detailed network topology diagram would help.

Hugo Kuo


2014-05-29 1:06 GMT+08:00 Shyam Prasad N <nspmangalore at gmail.com>:

> Hi,
>
> Not sure which mailing list is the right place for this question, so I'm
> including both openstack and openstack-dev in the CC list.
>
> I'm running a swift cluster with 4 nodes.
> All 4 nodes are symmetrical, i.e. proxy, object, container, and account
> servers run on each node with similar storage configurations and conf files.
> The I/O traffic to this cluster is mainly uploads of dynamic large objects
> (typically 1GB chunks (sub-objects), with around 5-6 chunks under each large
> object).
>
> The setup is running and serving data, but I've begun to see a few perf
> issues as the traffic increases. I want to understand the reason behind
> some of these issues and make sure that there is nothing wrong with the
> setup configuration.
>
> 1. High CPU utilization from rsync. I have set the replica count in each of
> the account, container, and object rings to 2. From what I've read, this
> assigns 2 devices to each partition in the storage cluster. For each PUT,
> the 2 replicas should be written synchronously, and for each GET, the I/O
> goes through one of the object servers. So nothing here should be
> asynchronous in nature. Then what is causing the rsync traffic?
>
> I ran a ring rebalance command recently, after adding a node. Could this be
> causing the issue?
>
> 2. High CPU utilization from swift-account-server threads. All my frontend
> traffic uses 1 account and 1 container on the servers. There are hundreds of
> objects in that single container. I don't understand what's keeping the
> account servers busy.
>
> 3. I've started noticing that the 1GB object transfers of the frontend
> traffic are taking significantly more time than they used to (more than
> double the time). Could this be because I'm using the same subnet for both
> the internal and the frontend traffic?
>
> 4. Can someone give me some pointers/tips for improving performance with my
> cluster configuration? (I think I've given most of the details above; feel
> free to ask if you need more.)
>
> As always, thanks in advance for your replies. Appreciate the support. :)
> --
> -Shyam
>