[Openstack] [SWIFT] PUTs and GETs getting slower

Robert van Leeuwen Robert.vanLeeuwen at spilgames.com
Tue Aug 6 14:54:52 UTC 2013

Could you check the disk IO on your container/object nodes?

We have quite a lot of files in Swift, and for comparison purposes I played a bit with COSbench to see where we hit the limits.
We currently max out at about 200-300 PUT requests/second, and the bottleneck is the disk IO on the object nodes.
Our account/container nodes are on SSDs and are not a limiting factor.

You can look for IO bottlenecks with e.g. "iostat -x 10" (this refreshes the view every 10 seconds).
During the benchmark I see some of the disks hitting 100% utilization.
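
If you would rather collect that utilization figure from a script, here is a minimal sketch of how it can be derived on Linux. It assumes the /proc/diskstats layout in which the 13th field is the number of milliseconds the device spent doing I/O; the function names are my own, and this is not how iostat itself is implemented.

```python
import time


def read_io_ticks(text):
    """Map device name -> milliseconds spent doing I/O.

    Assumes the /proc/diskstats layout: major, minor, name, then the
    per-device counters, with io_ticks as the 13th whitespace-separated field.
    """
    ticks = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 13:
            continue  # skip blank or malformed lines
        ticks[fields[2]] = int(fields[12])
    return ticks


def utilization(before, after, interval_s):
    """Percent of the interval each device was busy, from two snapshots."""
    interval_ms = interval_s * 1000.0
    return {
        dev: min(100.0, 100.0 * (after[dev] - before[dev]) / interval_ms)
        for dev in after
        if dev in before
    }


def sample(interval_s=10, path="/proc/diskstats"):
    """Take two snapshots interval_s seconds apart and return utilization."""
    with open(path) as f:
        before = read_io_ticks(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = read_io_ticks(f.read())
    return utilization(before, after, interval_s)
```

Calling sample(10) waits 10 seconds and returns a dict of device name to busy percentage; disks that stay close to 100 during the benchmark are the ones to look at.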
That we hit the IO limits at just 200 PUTs a second has to do with the number of files on the disks.
When I look at used inodes on our object nodes with "df -i", we are at about 60 million inodes per disk.
(A significant part of those are actually directories; I calculated about 30 million files based on the number of files in Swift.)
We use flashcache in front of those disks and it is still REALLY slow; just doing an "ls" can take up to 30 seconds.
Adding lots of memory would probably help by caching the inodes in memory, but that is quite challenging.
I am not sure how big a directory is in the XFS inode tree, but counting just the files:
30 million x 1k inodes = 30GB
And that is just one disk :)
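
For anyone who wants to redo this back-of-the-envelope calculation for their own disks, a rough sketch follows. It reads the inode counts the same way "df -i" reports them (via os.statvfs) and multiplies a file count by the inode size. The function names are made up for illustration, and treating the on-disk inode size as the memory needed per cached inode is the same simplification as in the estimate above, so take the result as an order of magnitude only.

```python
import os


def inode_usage(path):
    """Return (used, total) inode counts for the filesystem holding path."""
    st = os.statvfs(path)
    return st.f_files - st.f_ffree, st.f_files


def inode_bytes(count, inode_size):
    """Bytes needed to hold `count` inodes of `inode_size` bytes each."""
    return count * inode_size


# 30 million files per disk, as estimated above; 1024 and 256 are the
# two inode sizes discussed in this thread.
files_per_disk = 30_000_000
for size in (1024, 256):
    gb = inode_bytes(files_per_disk, size) / 1e9
    print(f"{size}-byte inodes: {gb:.1f} GB")
```

To plug in real numbers, call inode_usage() on the mount point of an object disk and pass the used count to inode_bytes(); keep in mind that count includes directories as well as files.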

We still use the old recommended inode size of 1k; with recent kernels the default of 256 can be used now:

So some time ago we decided to go for nodes with more, smaller and faster disks and more memory.
Those machines are not even close to their limits; however, we still have more "old" nodes,
so performance is limited by those old machines.
At this moment it is sufficient for our use case, but I am pretty confident we would be able to
significantly improve performance by adding more of the newer machines and rebalancing the load.

Robert van Leeuwen
