<div dir="ltr">Hi Robert, <div><br></div><div><div>I was reading your post and found it interesting because we have similar Swift deployments and use cases. </div><div>We are storing millions of small images in our Swift cluster (32 storage nodes, each with 12 x 2TB HDDs + 2 SSDs), and we see a total average of 200k rpm across the whole cluster.</div>


<div>In terms of disk utilization, we average 50% util across all our disks, but we are using only 15% of their total capacity.</div><div>When I look at used inodes on our object nodes with "df -i", we hit about 17 million inodes per disk.</div>


<div><br></div><div>So that seems a large number of inodes considering that we are using only 15% of the total capacity. One difference here is that we are using an inode size of 512 bytes and we have a large amount of memory. </div>
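<div>As a rough sanity check on those numbers, here is a small sketch of the average amount of data per inode on one disk. It assumes 2TB means 2 * 10**12 bytes, and it ignores the fact that directories also consume inodes, so the real average object size would be somewhat larger.</div>

```python
# Rough estimate of the average data stored per used inode on one disk.
# Figures from this message: 2 TB disks, about 15% full, about 17 million inodes.
# Assumption: 2 TB is taken as 2 * 10**12 bytes. Directories also use inodes,
# so the true average object size is somewhat larger than this figure.

def avg_bytes_per_inode(disk_bytes, used_fraction, used_inodes):
    """Return the average number of data bytes per used inode."""
    return disk_bytes * used_fraction / used_inodes


estimate = avg_bytes_per_inode(2 * 10**12, 0.15, 17 * 10**6)
print(f"{estimate / 1000:.1f} kB per inode")  # prints "17.6 kB per inode"
```

<div>That works out to roughly 17-18 kB per inode, which fits a workload of small images.</div>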


<div>Also, we always have one of our disks close to 100% util, and this is caused by the object-auditor, which scans all our disks continuously.  </div><div><br></div><div>So we were also considering changing the kind of disks that we use, to smaller and faster disks.</div>


<div>It would be really useful to know what kind of disks you are using in your old and new storage nodes, and to compare that with our case.<br></div>

</div><div><font face="arial, sans-serif"><br></font></div><div><br></div><div><span style="font-family:arial,sans-serif;font-size:13px">Cheers,</span><br></div><div><font face="arial, sans-serif">Max</font></div></div><div class="gmail_extra">


<br clear="all"><div><div><font style="color:rgb(136,136,136);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><b><br></b></font></div><font style="background-color:rgb(255,255,255)"><div style="font-weight:bold">


<font style="background-color:rgb(255,255,255)" color="#888888" face="arial, sans-serif"><img src="http://s14.postimage.org/sg1lztqep/cloudbuilders_Logo_last_small.png" width="96" height="58"></font></div><div style="font-weight:bold">


<font style="background-color:rgb(255,255,255)" color="#888888" face="arial, sans-serif"><br></font></div><font face="arial, sans-serif"><b><font color="#333333">Maximiliano Venesio</font><font color="#888888"> </font></b></font><br>


<font color="#888888" face="arial, sans-serif" style="font-weight:bold">#melicloud CloudBuilders</font></font><br style="color:rgb(136,136,136);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">


<font color="#666666" style="font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><span lang="ES" style="font-size:6pt;color:gray">Arias 3751, Piso 7 (C1430CRG) <br>Ciudad de Buenos Aires - Argentina<br>


Cel: +549(11) 15-3770-1853<br>Tel : +54(11) 4640-8411</span></font></div>

<br><br><div class="gmail_quote">On Tue, Aug 6, 2013 at 11:54 AM, Robert van Leeuwen <span dir="ltr">&lt;<a href="mailto:Robert.vanLeeuwen@spilgames.com" target="_blank">Robert.vanLeeuwen@spilgames.com</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Could you check your disk IO on the container/object nodes?<br>

<br>

We have quite a lot of files in swift and for comparison purposes I played a bit with COSbench to see where we hit the limits.<br>

We currently max out at about 200 - 300 PUT requests/second, and the bottleneck is the disk IO on the object nodes.<br>

Our account/container nodes are on SSDs and are not a limiting factor.<br>

<br>

You can look for IO bottlenecks with e.g. "iostat -x 10" (this will refresh the view every 10 seconds.)<br>

During the benchmark I see some of the disks hitting 100% utilization.<br>
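<br>
For anyone who wants to script this check, here is a minimal sketch of how a utilization figure can be derived from two samples of /proc/diskstats. It relies on the documented layout of that file, where the tenth statistics field is the number of milliseconds spent doing I/O. The sample lines and the device name sdb are made up for illustration, and this is only an approximation of what iostat reports.<br>

```python
# Sketch: approximate a device's utilization from two /proc/diskstats samples.
# Line layout: major, minor, device name, then the statistics fields.
# The tenth statistics field (index 12 after splitting) is the total time
# in milliseconds that the device spent doing I/O.

def io_ticks_ms(diskstats_line):
    """Return (device, milliseconds spent doing I/O) from one diskstats line."""
    fields = diskstats_line.split()
    return fields[2], int(fields[12])


def percent_util(before_line, after_line, interval_s):
    """Approximate utilization between two samples of the same device."""
    dev_a, ticks_a = io_ticks_ms(before_line)
    dev_b, ticks_b = io_ticks_ms(after_line)
    if dev_a != dev_b:
        raise ValueError("samples are for different devices")
    busy_fraction = (ticks_b - ticks_a) / (interval_s * 1000.0)
    return min(100.0, 100.0 * busy_fraction)


# Made-up sample lines for a hypothetical device sdb, taken 10 seconds apart.
before = "8 16 sdb 1000 0 8000 500 2000 0 16000 900 0 120000 1400"
after = "8 16 sdb 1500 0 12000 800 2600 0 20800 1300 0 129500 2100"
print(f"{percent_util(before, after, 10):.1f}% util")  # prints "95.0% util"
```

On a real node the two lines would be read from /proc/diskstats at the start and end of the interval.<br>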

That it is hitting the IO limits with just 200 puts a second has to do with the number of files on the disks.<br>

When I look at used inodes on our object nodes with "df -i" we hit about 60 million inodes per disk.<br>

(a significant part of those are actually directories; I calculated about 30 million files based on the number of files in swift)<br>
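<br>
The same inode counts that "df -i" shows can also be read programmatically, which is handy for graphing them over time. This is a minimal sketch using Python's os.statvfs. The path "/" is used only so that it runs anywhere; on an object node you would point it at the mount point of a data disk (for example something like /srv/node/sdb1, which is a hypothetical path here).<br>

```python
import os


def inode_usage(path):
    """Return (used, total, percent_used) inodes for the filesystem at path."""
    st = os.statvfs(path)
    total = st.f_files            # total inodes on the filesystem
    used = total - st.f_ffree     # total minus free inodes
    # Some filesystems report zero total inodes; avoid dividing by zero.
    percent = 100.0 * used / total if total else 0.0
    return used, total, percent


# "/" is a placeholder so the sketch runs anywhere; use a data disk mount point.
used, total, percent = inode_usage("/")
print(f"{used} of {total} inodes used ({percent:.1f}%)")
```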

We use flashcache in front of those disks and it is still REALLY slow, just doing a "ls" can take up to 30 seconds.<br>

Probably adding lots of memory should help caching the inodes in memory but that is quite challenging:<br>

I am not sure how big a directory is in the xfs inode tree but just the files:<br>

30 million x 1k inodes =  30GB<br>

And that is just one disk :)<br>

<br>

We still use the old recommended inode size of 1k; the default of 256 can be used now with recent kernels:<br>

<a href="https://lists.launchpad.net/openstack/msg24784.html" target="_blank">https://lists.launchpad.net/openstack/msg24784.html</a><br>
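<br>
To put the two inode sizes side by side, here is the same back-of-the-envelope calculation as a small sketch. It assumes 30 million file inodes per disk, takes 1k as 1024 bytes, and uses only the on-disk inode size. The in-memory footprint per cached inode is likely larger, and directories are not counted, so treat these as lower bounds.<br>

```python
# Back-of-the-envelope size of the inodes for the files on one disk.
# Assumptions: 30 million file inodes per disk, on-disk inode size only.
# Directories and in-memory overhead are ignored, so these are lower bounds.

def inode_cache_gb(n_inodes, inode_size_bytes):
    """Return the total inode size in GB (10**9 bytes)."""
    return n_inodes * inode_size_bytes / 10**9


for size in (256, 1024):
    total_gb = inode_cache_gb(30 * 10**6, size)
    print(f"{size}-byte inodes: {total_gb:.1f} GB per disk")
# prints:
# 256-byte inodes: 7.7 GB per disk
# 1024-byte inodes: 30.7 GB per disk
```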

<br>

So some time ago we decided to go for nodes with more, smaller &amp; faster disks and more memory.<br>

Those machines are not even close to their limits; however, we still have more "old" nodes<br>

so performance is limited by those machines.<br>

At this moment it is sufficient for our use case but I am pretty confident we would be able to<br>

significantly improve performance by adding more of those machines and doing some re-balancing of the load.<br>

<br>

Cheers,<br>

Robert van Leeuwen<br>

<div class="HOEnZb"><div class="h5">

</div></div></blockquote></div><br></div>