<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"><meta name="Generator" content="Microsoft Word 12 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
{mso-style-priority:99;
mso-style-link:"Balloon Text Char";
margin:0in;
margin-bottom:.0001pt;
font-size:8.0pt;
font-family:"Tahoma","sans-serif";}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
span.BalloonTextChar
{mso-style-name:"Balloon Text Char";
mso-style-priority:99;
mso-style-link:"Balloon Text";
font-family:"Tahoma","sans-serif";}
span.EmailStyle19
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:windowtext;}
span.EmailStyle20
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle21
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle22
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:1659185528;
mso-list-type:hybrid;
mso-list-template-ids:1572868358 1913973220 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:.75in;
text-indent:-.25in;
mso-ansi-font-weight:bold;}
@list l1
{mso-list-id:2019431291;
mso-list-type:hybrid;
mso-list-template-ids:124044422 -1145262076 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l1:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:.75in;
text-indent:-.25in;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style></head><body lang="EN-US" link="blue" vlink="purple"><div class="WordSection1"><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><span style="color:#1f497d">Hi All,</span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><span style="color:#1f497d">I have attached a simple timeline of proxy+object latency chart for a single PUT request. Please check.</span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><span style="color:#1f497d">I am profiling the swift proxy + object server to improve the latency of single PUT request. This may help to improve the overall OPS performance.</span></p><p class="MsoNormal"><span style="color:#1f497d">Test Configuration : 4CPU + 16GB + 1 proxy node + 1 storage node + 1 replica for object ring, 3 replica for container ring on SSD; perform 4k PUT</span><span style="color:#1f497d"> (one-by-one)</span><span style="color:#1f497d"> request</span><span style="color:#1f497d">.</span></p><p class="MsoNormal"><span style="color:#1f497d">Every 4K PUT request in the above case takes 22ms (30ms for 3 replica-count for object). Target is to bring the per 4K put request below 10ms to double the overall OPS performance.</span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><span style="color:#1f497d">There are some potential places where we can improve the latency to achieve this. Can you please provide your thoughts.</span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><b><span style="color:#1f497d">Performance optimization-1: </span></b><span style="color:#1f497d">Proxy server don’t have to get blocked in connect() - getexpect() until object-server responds.<b></b></span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="color:#1f497d">Problem Today: </span></b><span style="color:#1f497d">On PUT request, the proxy server connect_put_node() wait for the response from the object server (getexpect()) after the connection is established. Once the response (‘HTTP_CONTINUE’) is received, the proxy server goes ahead and spawn the send_file thread to send data to object server’s. There code looks serialized between proxy and object server.</span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="color:#1f497d">Optimization</span></b><span style="color:#1f497d">:</span></p><p class="MsoListParagraph" style="margin-left:.75in"><b><span style="color:#1f497d">Option1:</span></b><span style="color:#1f497d"> Avoid waiting for all the connect to complete before proceeding with the send_data to the connected object-server’s ? </span></p><p class="MsoListParagraph" style="margin-left:.75in"><b><span style="color:#1f497d">Option2:</span></b><span style="color:#1f497d"> The purpose of the getexpect() call is not very clear. Can we relax this, so that the proxy server will go-ahead read the data_source and send it to the object server quickly after the connection is established. We may have to handle extra failure cases here. (FYI: This reduce 3ms for a single PUT request ).</span></p><p class="MsoListParagraph" style="margin-left:1.0in"><span style="color:#1f497d"> def _connect_put_node(self, nodes, part, path, headers,</span></p><p class="MsoListParagraph" style="margin-left:1.0in"><span style="color:#1f497d"> logger_thread_locals, req):</span></p><p class="MsoListParagraph" style="margin-left:1.0in"><span style="color:#1f497d"> """Method for a file PUT connect"""</span></p><p class="MsoListParagraph" style="margin-left:1.0in"><span style="color:#1f497d"> ………..<br> <b>with Timeout(self.app.node_timeout):</b></span></p><p class="MsoListParagraph" style="margin-left:1.0in"><b><span style="color:#1f497d"> resp = conn.getexpect()</span></b></p><p class="MsoListParagraph" style="margin-left:1.0in"><span style="color:#1f497d"> ………</span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><b><span style="color:#1f497d">Performance Optimization-2</span></b><span style="color:#1f497d">: Object server serialize the container_update after the data write.</span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="color:#1f497d">Problem Today:</span></b><span style="color:#1f497d"> On PUT request, the object server, after writing the data and meta data, the container_update() is called, which is serialized to all storage nodes (3 way). Each container update take 3 millisecond and it adds to 9 millisecond for the container_update to complete.</span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="color:#1f497d">Optimization:</span></b><span style="color:#1f497d"> Can we make this parallel using the green thread, and probably <b>return success on the first successful container update</b>, if there is no connection error? I am trying to understand whether this will have any data integrity issues, can you please provide your feed back on this ?</span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="color:#1f497d">(FYI:</span></b><span style="color:#1f497d"> this reduce atlest 5 millisecond)</span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><b><span style="color:#1f497d">Performance Optimization-3</span></b><span style="color:#1f497d">: write(metadata) in object server takes 2 to 3 millisecond</span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="color:#1f497d">Problem today:</span></b><span style="color:#1f497d"> After writing the data to the file, writer.put(metadata) -> _<i>finalize</i>_put() to process the post write operation. This takes an average of 3 millisecond for every put request.</span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="color:#1f497d">Optimization:</span></b><span style="color:#1f497d"> </span></p><p class="MsoListParagraph" style="margin-left:.75in"><b><span style="color:#1f497d">Option 1:</span></b><span style="color:#1f497d"> Is it possible to flush the file (group of files) asynchronously in _<i>finalize</i>_put() </span></p><p class="MsoListParagraph" style="margin-left:.75in"><b><span style="color:#1f497d">Option 2:</span></b><span style="color:#1f497d"> Can we make this put(metadata) an asynchronous call ? so the container update can happen in parallel ? </span><span style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1f497d;background:white">Error conditions must be handled properly.</span><span style="color:#1f497d"></span></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1f497d;background:white"> </span></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1f497d;background:white">I would like to know, whether we have done any work done in this area, so not to repeat the effort.</span><span style="color:#1f497d"></span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><span style="color:#1f497d">The motivation for this work, is because 30millisecond for a single 4K I/O looks too high. With this the only way to scale is to put more server’s. Trying to see whether we can achieve anything quickly to modify some portion of code or this may require quite a bit of code-rewrite. </span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><span style="color:#1f497d">Also, suggest whether this approach/work on reducing latency of 1 PUT request is correct ?</span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><span style="color:#1f497d">Thanks</span></p><p class="MsoNormal"><span style="color:#1f497d">-kiru</span></p><p class="MsoNormal" style="margin-left:.5in"><span style="color:#1f497d"> </span></p><div><div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0in 0in 0in"><p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Shyam Kaushik [<a href="mailto:shyam@zadarastorage.com">mailto:shyam@zadarastorage.com</a>] <br><b>Sent:</b> Friday, September 04, 2015 11:53 AM<br><b>To:</b> Kirubakaran Kaliannan<br><b>Subject:</b> RE: profiling per I/O logs</span></p></div></div><p class="MsoNormal"> </p><p class="MsoNormal"><b><span style="color:#1f497d">Hi Kiru,</span></b></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><span style="color:#1f497d">I listed couple of optimization options like below. Can you pls list down 3-4 optimizations like below in similar format & pass it back to me for a quick review. Once we finalize lets bounce it with community on what they think.</span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><b><span style="color:#1f497d">Performance optimization-1:</span></b><span style="color:#1f497d"> Proxy-server - on PUT request drive client side independent of auth/object-server connection establishment</span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="color:#1f497d">Problem today:</span></b><span style="color:#1f497d"> on PUT request, client connects/puts header to proxy-server. Proxy-server goes to auth & then looks up ring, connects to each of object-server sending a header. Then when object-servers accept the connection, proxy-server sends HTTP continue to client & now client writes data into proxy-server & then proxy-server writes data to the object-servers</span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="color:#1f497d">Optimization:</span></b><span style="color:#1f497d"> Proxy-server can drive the client side independent of backend side. i.e. upon auth completes, proxy-server through a thread can send HTTP continue to client & ask for the data to be written. In the background it can try to connect to object-server writing the header. This way when the backend side is doing work, parallel work is done at proxy front-end thereby reducing latency on the overall IO processing</span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><span style="color:#1f497d"><<Can you pls confirm if this is the case>></span></p><p class="MsoNormal"><b><span style="color:#1f497d">Performance optimization-2:</span></b><span style="color:#1f497d"> Proxy does TCP connect/disconnect on every PUT to object-server & similarly object-server to container-server updates</span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="color:#1f497d">Problem today:</span></b><span style="color:#1f497d"> swift/common/bufferedhttp.py does TCP connect for every BufferedHTTPConnection::connect(). </span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="color:#1f497d">Optimization:</span></b><span style="color:#1f497d"> Maintain TCP connection pool below bufferedhttp.py. refcounted pool. Connection pool manager periodically cleans up unreferenced connections. Re-use past tcp connections for quicker HTTPConnection</span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><p class="MsoNormal"><span style="color:#1f497d">--Shyam </span></p><p class="MsoNormal"><span style="color:#1f497d"> </span></p><div><div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0in 0in 0in"><p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Kirubakaran Kaliannan [mailto:<a href="mailto:kirubak@zadarastorage.com">kirubak@zadarastorage.com</a>] <br><b>Sent:</b> Thursday, September 03, 2015 3:10 PM<br><b>To:</b> Shyam Kaushik<br><b>Subject:</b> profiling per I/O logs</span></p></div></div><p class="MsoNormal"> </p><p class="MsoNormal"> </p><p class="MsoNormal">Hi Shyam,</p><p class="MsoNormal"> </p><p class="MsoNormal">You can look at the directory</p><p class="MsoNormal"> </p><p class="MsoNormal">/mnt/work/kirubak/profile/perio/*</p><p class="MsoNormal"> </p><p class="MsoNormal">This is for single object replica with 3 way container.</p><p class="MsoNormal"> </p><p class="MsoNormal">The list of potential identified task that we can work with community are (in issues section – P1, O1, O2 – which we can discuss)</p><p class="MsoNormal"> </p><p class="MsoNormal"><a href="https://docs.google.com/spreadsheets/d/1We577s7CQRfq2RmpPCN04kEHc8HD_4g_54ELPn-F0g0/edit#gid=288817690">https://docs.google.com/spreadsheets/d/1We577s7CQRfq2RmpPCN04kEHc8HD_4g_54ELPn-F0g0/edit#gid=288817690</a></p><p class="MsoNormal"> </p><p class="MsoNormal">Thanks</p><p class="MsoNormal">-kiru</p><div class="MsoNormal" align="center" style="text-align:center"><span style="font-size:12.0pt;font-family:"Times New Roman","serif""><hr size="1" width="100%" noshade style="color:#7f7f7f" align="center"></span></div><p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:12.0pt;font-family:"Times New Roman","serif"">No virus found in this message.<br>Checked by AVG - <a href="http://www.avg.com">www.avg.com</a><br>Version: 2015.0.6125 / Virus Database: 4409/10565 - Release Date: 09/03/15</span></p><div class="MsoNormal" align="center" style="text-align:center"><span style="font-size:12.0pt;font-family:"Times New Roman","serif""><hr size="1" width="100%" noshade style="color:#a0a0a0" align="center"></span></div><p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:12.0pt;font-family:"Times New Roman","serif"">No virus found in this message.<br>Checked by AVG - <a href="http://www.avg.com">www.avg.com</a><br>Version: 2015.0.6086 / Virus Database: 4409/10558 - Release Date: 09/01/15</span></p></div></body></html>