<html dir="ltr">

<head>

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

<style>

<!--

@font-face

        {font-family:"Cambria Math"}

@font-face

        {font-family:Calibri}

@font-face

        {font-family:Tahoma}

p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0cm;

        margin-bottom:.0001pt;

        font-size:12.0pt;

        font-family:"Times New Roman","serif"}

a:link, span.MsoHyperlink

        {color:blue;

        text-decoration:underline}

a:visited, span.MsoHyperlinkFollowed

        {color:purple;

        text-decoration:underline}

span.E-MailFormatvorlage17

        {font-family:"Calibri","sans-serif";

        color:#1F497D}

.MsoChpDefault

        {font-family:"Calibri","sans-serif"}

@page WordSection1

        {margin:70.85pt 70.85pt 2.0cm 70.85pt}

-->

</style><style id="owaParaStyle" type="text/css">P {margin-top:0;margin-bottom:0;}</style>

</head>

<body ocsi="0" fpstyle="1" lang="DE" link="blue" vlink="purple">

<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 10pt;">Just a follow-up on this thread, because I've taken some time to write up our experiences:<br>

<a href="http://engineering.spilgames.com/openstack-swift-lots-small-files/" target="_blank">http://engineering.spilgames.com/openstack-swift-lots-small-files/</a><br>

<br>

Klaus, <br>

<br>

Answering your question on initial sync times:<br>

Yes, we also see long initial syncs. <br>

For us it takes a few days for a new node to sync. <br>

Usually it goes pretty quickly at first (30 MB/second), and performance gradually degrades as the disks fill up and the machines run low on memory.<br>

We have about 6TB on a node to sync.<br>
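<br>

As a rough sanity check on those numbers (a sketch only, assuming decimal units and that the 30 MB/second peak rate held for the entire sync, which it does not in practice), the best case already works out to a bit over two days:<br>

<pre>
```python
SECONDS_PER_DAY = 86400


def min_sync_days(data_bytes, rate_bytes_per_s):
    """Lower bound on sync time in days, assuming a constant transfer rate."""
    return data_bytes / rate_bytes_per_s / SECONDS_PER_DAY


TB = 10**12  # decimal terabyte (assumption)
MB = 10**6   # decimal megabyte (assumption)

# 6 TB per node at the initial 30 MB/second.
days = min_sync_days(6 * TB, 30 * MB)
print(round(days, 2))
```
</pre>

Because throughput drops as the disks fill up, the real sync takes longer than this lower bound.<br>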

<br>

Cheers,<br>

Robert van Leeuwen<br>

<br>

<br>

<br>

<div style="font-family: Times New Roman; color: #000000; font-size: 16px">

<hr tabindex="-1">

<div style="direction: ltr;" id="divRpF731726"><font color="#000000" face="Tahoma" size="2"><b>From:</b> Klaus Schürmann [klaus.schuermann@mediabeam.com]<br>

<b>Sent:</b> Tuesday, August 20, 2013 9:04 AM<br>

<b>To:</b> Maximiliano Venesio; Robert van Leeuwen<br>

<b>Cc:</b> openstack@lists.openstack.org<br>

<b>Subject:</b> AW: [Openstack] [SWIFT] PUTs and GETs getting slower<br>

</font><br>

</div>

<div></div>

<div>

<div class="WordSection1">

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">Hi,</span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US"> </span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">After adding more disks and moving the account and container servers to SSDs, performance is much better:</span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US"> </span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">Before:</span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">GETs      average               620 ms</span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">PUTs     average               1900 ms</span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US"> </span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">After:</span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">GETs      average               280 ms</span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">PUTs     average               1100 ms</span></p>
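<p class="MsoNormal">To put those averages in relative terms, here is a minimal sketch. The function name is only for illustration; the figures are the ones listed above.</p>

<pre>
```python
def reduction_percent(before_ms, after_ms):
    """Relative latency reduction, as a percentage of the 'before' value."""
    return (before_ms - after_ms) / before_ms * 100


# Average latencies in milliseconds, as reported above.
measurements = [("GET", 620, 280), ("PUT", 1900, 1100)]

for name, before, after in measurements:
    print(name, round(reduction_percent(before, after), 1))
```
</pre>

<p class="MsoNormal">That is roughly a 55% reduction for GETs and a 42% reduction for PUTs.</p>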

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US"> </span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">However, the rebalance took days to sync all the data to the five additional disks (previously each storage node had 3 disks). I used a concurrency of 4. One replication pass over all partitions took more than 24 hours. After five days, a replication pass takes only 300 seconds.</span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">Each additional disk now holds 300 GB of data. Is that a normal duration for syncing the data?</span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US"> </span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">Thanks</span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US">Klaus</span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US"> </span></p>

<p class="MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="EN-US"> </span></p>

<p class="MsoNormal"><b><span style="font-size:10.0pt; font-family:"Tahoma","sans-serif"" lang="EN-US">From:</span></b><span style="font-size:10.0pt; font-family:"Tahoma","sans-serif"" lang="EN-US"> Maximiliano Venesio [mailto:maximiliano.venesio@mercadolibre.com]

<br>

</span><b><span style="font-size:10.0pt; font-family:"Tahoma","sans-serif"">Sent:</span></b><span style="font-size:10.0pt; font-family:"Tahoma","sans-serif""> Thursday, August 8, 2013 5:26 PM<br>

<b>To:</b> Robert van Leeuwen<br>

<b>Cc:</b> openstack@lists.openstack.org<br>

<b>Subject:</b> Re: [Openstack] [SWIFT] PUTs and GETs getting slower</span></p>

<p class="MsoNormal"> </p>

<div>

<p class="MsoNormal">Hi Robert, </p>

<div>

<p class="MsoNormal"> </p>

</div>

<div>

<div>

<p class="MsoNormal">I was reading your post and found it interesting, because we have similar Swift deployments and use cases. </p>

</div>

<div>

<p class="MsoNormal">We are storing millions of small images in our Swift cluster: 32 storage nodes, each with 12 x 2TB HDDs + 2 SSDs, and we see a total average of 200k rpm across the whole cluster.</p>

</div>

<div>

<p class="MsoNormal">In terms of disk utilization, we average 50% util across all our disks, but we are using only 15% of their total capacity.</p>

</div>

<div>

<p class="MsoNormal">When I look at used inodes on our object nodes with "df -i" we hit about 17 million inodes per disk.</p>

</div>

<div>

<p class="MsoNormal"> </p>

</div>

<div>

<p class="MsoNormal">That seems like a large number of inodes, considering that we are using just 15% of the total capacity. One difference here is that we are using an inode size of 512K and we have a large amount of memory. </p>

</div>
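<div>

<p class="MsoNormal">To make that ratio concrete, here is a rough sketch. It assumes decimal units, that the 15% fill level applies evenly to each 2TB disk, and that all 17 million inodes count equally; since some of them are directories, the average per file would be somewhat higher.</p>

<pre>
```python
def avg_bytes_per_inode(disk_bytes, used_fraction, inode_count):
    """Average stored bytes per used inode (files and directories alike)."""
    return disk_bytes * used_fraction / inode_count


TB = 10**12  # decimal terabyte (assumption)

# 2 TB disk, 15% full, about 17 million used inodes.
avg = avg_bytes_per_inode(2 * TB, 0.15, 17_000_000)
print(round(avg / 1000, 1))  # average in kB
```
</pre>

<p class="MsoNormal">Under those assumptions each inode accounts for under 18 kB of stored data on average, which fits a workload of millions of small images.</p>

</div>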

<div>

<p class="MsoNormal">Also, one of our disks is always close to 100% util; this is caused by the object-auditor, which scans all our disks continuously.  </p>

</div>

<div>

<p class="MsoNormal"> </p>

</div>

<div>

<p class="MsoNormal">So we were also thinking about changing the kind of disks we use, to smaller and faster disks.</p>

</div>

<div>

<p class="MsoNormal">It would be really useful to know what kind of disks you are using in your old and new storage nodes, so we can compare that with our case.</p>

</div>

</div>

<div>

<p class="MsoNormal"> </p>

</div>

<div>

<p class="MsoNormal"> </p>

</div>

<div>

<p class="MsoNormal"><span style="font-size:10.0pt; font-family:"Arial","sans-serif"">Cheers,</span></p>

</div>

<div>

<p class="MsoNormal"><span style="font-family:"Arial","sans-serif"">Max</span></p>

</div>

</div>

<div>

<p class="MsoNormal"><br clear="all">

</p>

<div>

<div>

<p class="MsoNormal"> </p>

</div>

<div>

<p class="MsoNormal"><b><span style="font-family:"Arial","sans-serif"; color:#888888; background:white"><img id="_x0000_i1025" src="http://s14.postimage.org/sg1lztqep/cloudbuilders_Logo_last_small.png" height="58" width="96"></span><span style="background:white"></span></b></p>

</div>

<div>

<p class="MsoNormal"><b><span style="background:white"> </span></b></p>

</div>

<p class="MsoNormal"><b><span style="font-family:"Arial","sans-serif"; color:#333333; background:white">Maximiliano Venesio</span></b><b><span style="font-family:"Arial","sans-serif"; color:#888888; background:white"> </span></b><span style="background:white"><br>

</span><b><span style="font-family:"Arial","sans-serif"; color:#888888; background:white">#melicloud CloudBuilders</span></b><span style="font-size:10.0pt; font-family:"Arial","sans-serif"; color:#888888"><br>

</span><span style="font-size:6.0pt; font-family:"Arial","sans-serif"; color:gray; background:white" lang="ES">Arias 3751, Piso 7 (C1430CRG) <br>

Ciudad de Buenos Aires - Argentina<br>

Cel: +549(11) 15-3770-1853<br>

Tel : +54(11) 4640-8411</span></p>

</div>

<p class="MsoNormal" style="margin-bottom:12.0pt"> </p>

<div>

<p class="MsoNormal">On Tue, Aug 6, 2013 at 11:54 AM, Robert van Leeuwen &lt;<a href="mailto:Robert.vanLeeuwen@spilgames.com" target="_blank">Robert.vanLeeuwen@spilgames.com</a>&gt; wrote:</p>

<p class="MsoNormal">Could you check your disk IO on the container/object nodes?<br>

<br>

We have quite a lot of files in Swift, and for comparison purposes I played a bit with COSbench to see where we hit the limits.<br>

We currently max out at about 200-300 PUT requests/second, and the bottleneck is the disk IO on the object nodes.<br>

Our account/container nodes are on SSDs and are not a limiting factor.<br>

<br>

You can look for IO bottlenecks with e.g. "iostat -x 10" (this will refresh the view every 10 seconds.)<br>

During the benchmark I see some of the disks hitting 100% utilization.<br>

That it hits the IO limits at just 200 PUTs a second has to do with the number of files on the disks.<br>

When I look at used inodes on our object nodes with "df -i" we hit about 60 million inodes per disk.<br>

(A significant part of those are actually directories; I calculated about 30 million files based on the number of files in Swift.)<br>

We use flashcache in front of those disks and it is still REALLY slow; just doing an "ls" can take up to 30 seconds.<br>

Probably adding lots of memory should help caching the inodes in memory but that is quite challenging:<br>

I am not sure how big a directory is in the xfs inode tree but just the files:<br>

30 million x 1k inodes =  30GB<br>

And that is just one disk :)<br>

<br>

We still use the old recommended inode size of 1k; with recent kernels the default of 256 can now be used:<br>

<a href="https://lists.launchpad.net/openstack/msg24784.html" target="_blank">https://lists.launchpad.net/openstack/msg24784.html</a><br>

<br>

So some time ago we decided to go for nodes with more, smaller and faster disks and more memory.<br>

Those machines are not even close to their limits; however, we still have more "old" nodes,<br>

so performance is limited by those machines.<br>

At this moment it is sufficient for our use case but I am pretty confident we would be able to<br>

significantly improve performance by adding more of those machines and doing some re-balancing of the load.<br>

<br>

Cheers,<br>

Robert van Leeuwen</p>
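<p class="MsoNormal">The inode arithmetic in the message above can be written out as a small sketch. It counts only the configured on-disk inode size for the roughly 30 million files on one disk; directory inodes and any extra kernel overhead for cached inodes are ignored, so treat the result as a rough lower bound rather than a measurement.</p>

<pre>
```python
def inode_cache_bytes(inode_count, inode_size_bytes):
    """Bytes needed to hold every inode once, at the given inode size."""
    return inode_count * inode_size_bytes


FILES_PER_DISK = 30_000_000  # estimate from the message above
GB = 10**9                   # decimal gigabyte (assumption)

# Compare the old 1k inode size with the 256-byte default.
for inode_size in (1024, 256):
    total = inode_cache_bytes(FILES_PER_DISK, inode_size)
    print(inode_size, round(total / GB, 2))
```
</pre>

<p class="MsoNormal">With 1k inodes that is about 30 GB per disk, as stated; with 256-byte inodes the same estimate drops to under 8 GB per disk.</p>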


</div>

<p class="MsoNormal"> </p>

</div>

</div>

</div>

</div>

</div>

</body>

</html>