[openstack-dev] profiling Latency of single PUT operation on proxy + storage

Kirubakaran Kaliannan kirubak at zadarastorage.com
Tue Sep 8 11:33:56 UTC 2015


Hi All,



I have attached a simple timeline chart of proxy + object server latency for
a single PUT request. Please take a look.



I am profiling the Swift proxy and object servers to reduce the latency of a
single PUT request. This may also help improve the overall OPS performance.

Test configuration: 4 CPU + 16GB, 1 proxy node + 1 storage node, 1
replica for the object ring, 3 replicas for the container ring, on SSD;
4K PUT requests issued one by one.

Every 4K PUT request in this setup takes 22ms (30ms with an object replica
count of 3). The target is to bring each 4K PUT request below 10ms, to double
the overall OPS performance.
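
To make the target concrete: for a single client issuing requests one by one, throughput is just the inverse of per-request latency. Below is a minimal sketch of that arithmetic. The function name is only illustrative, and the relation no longer holds once requests are issued concurrently.

```python
def sequential_ops_per_second(latency_ms):
    """Requests per second for one client issuing requests back to back."""
    return 1000.0 / latency_ms


# Latencies quoted above, plus the 10ms target.
for label, latency_ms in [("1 object replica", 22),
                          ("3 object replicas", 30),
                          ("target", 10)]:
    rate = sequential_ops_per_second(latency_ms)
    print(f"{label}: {latency_ms} ms -> {rate:.1f} ops/s")
```

Going from 22ms to 10ms would take this single-stream rate from about 45 to 100 requests per second, slightly more than double.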



There are some places where we can potentially reduce the latency to achieve
this. Can you please share your thoughts?



*Performance optimization-1:* The proxy server does not have to block in
connect() - getexpect() until the object server responds.

*Problem today:* On a PUT request, the proxy server's _connect_put_node() waits
for the response from the object server (getexpect()) after the connection
is established. Once the response (‘HTTP_CONTINUE’) is received, the proxy
server goes ahead and spawns the send_file thread to send data to the object
servers. The code looks serialized between the proxy and the object server.

*Optimization*:

*Option 1:* Can we avoid waiting for all the connects to complete before
starting to send data to the object servers that are already connected?

*Option 2:* The purpose of the getexpect() call is not very clear. Can we
relax this, so that the proxy server goes ahead, reads the data_source, and
sends it to the object server as soon as the connection is established? We
may have to handle extra failure cases here. (FYI: this saves 3ms on a
single PUT request.)

    def _connect_put_node(self, nodes, part, path, headers,
                          logger_thread_locals, req):
        """Method for a file PUT connect"""
        ...
                with Timeout(self.app.node_timeout):
                    resp = conn.getexpect()  # <-- the proxy waits here
        ...



*Performance optimization-2:* The object server serializes the
container_update after the data write.

*Problem today:* On a PUT request, after writing the data and metadata, the
object server calls container_update(), which is sent serially to each
container replica (3-way). Each container update takes 3 milliseconds, so
container_update takes 9 milliseconds in total to complete.

*Optimization:* Can we make this parallel using green threads, and
possibly *return success on the first successful container update*, if
there is no connection error? I am trying to understand whether this would
cause any data integrity issues. Can you please provide your feedback on
this?

*(FYI:* this saves at least 5 milliseconds.)
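
The parallel-update idea can be sketched as below. This is only a rough illustration, not Swift code: Swift uses eventlet green threads, while this sketch uses standard-library threads as a stand-in so that it runs on its own. The names first_successful_update and send_update are hypothetical. What should happen when one of the remaining updates fails later is exactly the open data-integrity question, and the sketch does not answer it.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def first_successful_update(nodes, send_update):
    """Send the update to all nodes concurrently.

    Returns the first node whose update succeeded, or None if all failed.
    send_update(node) is a hypothetical callable that raises on failure.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(nodes)))
    try:
        futures = {pool.submit(send_update, node): node for node in nodes}
        for future in as_completed(futures):
            try:
                future.result()
            except Exception:
                continue  # this node failed; wait for the next one to finish
            return futures[future]
        return None
    finally:
        # Do not block on the slower updates; they finish in the background.
        pool.shutdown(wait=False)
```

With three updates of about 3 milliseconds each, the caller would wait roughly for the fastest one instead of for the sum of all three.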



*Performance optimization-3:* write(metadata) in the object server takes 2 to
3 milliseconds

*Problem today:* After writing the data to the file, writer.put(metadata)
calls _finalize_put() to perform the post-write operations. This takes an
average of 3 milliseconds for every PUT request.

*Optimization:*

*Option 1:* Is it possible to flush the file (or a group of files)
asynchronously in _finalize_put()?

*Option 2:* Can we make this put(metadata) call asynchronous, so that the
container update can happen in parallel? Error conditions must be handled
properly.
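
A rough sketch of Option 2, again with standard-library threads standing in for green threads. finalize_and_update, finalize_put and container_update are hypothetical callables here, not the real Swift methods. Note the error case this opens up: the container update may already have been sent when the finalize step fails, so the container could list an object that was never made durable.

```python
from concurrent.futures import ThreadPoolExecutor


def finalize_and_update(finalize_put, container_update):
    """Run the metadata finalize step and the container update concurrently.

    Raises whatever finalize_put raised, since the object is not durable
    in that case. Returns True if the container update also succeeded and
    False if only the container update failed.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        finalize_future = pool.submit(finalize_put)
        update_future = pool.submit(container_update)
        finalize_future.result()  # re-raises a finalize failure
        try:
            update_future.result()
        except Exception:
            return False
        return True
```

If the two steps take about 3 milliseconds each, the overlapped version costs roughly the longer of the two rather than their sum.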



I would like to know whether any work has already been done in this area, so
that we do not repeat the effort.



The motivation for this work is that 30 milliseconds for a single 4K I/O
looks too high. At this latency, the only way to scale is to add more servers.
We are trying to see whether we can achieve something quickly by modifying a
small portion of the code, or whether this requires a substantial rewrite.



Also, please let us know whether working on reducing the latency of a single
PUT request is the right approach.





Thanks

-kiru



*From:* Shyam Kaushik [mailto:shyam at zadarastorage.com]
*Sent:* Friday, September 04, 2015 11:53 AM
*To:* Kirubakaran Kaliannan
*Subject:* RE: profiling per I/O logs



*Hi Kiru,*



I have listed a couple of optimization options below. Can you please list
3-4 optimizations in a similar format and pass them back to me for a
quick review? Once we finalize them, let's run them by the community to see
what they think.



*Performance optimization-1:* Proxy-server - on PUT request drive client
side independent of auth/object-server connection establishment

*Problem today:* On a PUT request, the client connects and sends the headers
to the proxy-server. The proxy-server goes to auth, then looks up the ring and
connects to each object-server, sending a header. When the object-servers
accept the connection, the proxy-server sends HTTP continue to the client; the
client then writes data to the proxy-server, and the proxy-server writes the
data to the object-servers.

*Optimization:* The proxy-server can drive the client side independently of
the backend side, i.e. once auth completes, the proxy-server can use a thread
to send HTTP continue to the client and ask for the data to be written. In the
background it can connect to the object-servers and write the header. This
way, while the backend side is doing work, the proxy front-end works in
parallel, reducing the latency of the overall I/O processing.



<<Can you pls confirm if this is the case>>

*Performance optimization-2:* Proxy does TCP connect/disconnect on every
PUT to object-server & similarly object-server to container-server updates

*Problem today:* swift/common/bufferedhttp.py does a TCP connect for every
BufferedHTTPConnection::connect().

*Optimization:* Maintain a refcounted TCP connection pool below
bufferedhttp.py. A connection pool manager periodically cleans up
unreferenced connections. Past TCP connections are reused for quicker
HTTPConnection setup.
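
A minimal sketch of such a pool, for illustration only. The ConnectionPool class and its acquire/release/reap methods are hypothetical and are not part of bufferedhttp.py. For simplicity it tracks idle connections with a timestamp instead of reference counts, and it is not thread-safe. The connect argument is whatever callable opens a new connection to (host, port).

```python
import time
from collections import defaultdict


class ConnectionPool:
    """Keep idle connections per (host, port) and reuse them."""

    def __init__(self, connect, max_idle_seconds=30.0, clock=time.monotonic):
        self._connect = connect          # callable(host, port) -> connection
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._idle = defaultdict(list)   # (host, port) -> [(conn, released_at)]

    def acquire(self, host, port):
        """Return an idle connection if one exists, else open a new one."""
        idle = self._idle[(host, port)]
        if idle:
            conn, _released_at = idle.pop()
            return conn
        return self._connect(host, port)

    def release(self, host, port, conn):
        """Hand a connection back so that a later request can reuse it."""
        self._idle[(host, port)].append((conn, self._clock()))

    def reap(self):
        """Close connections idle for too long; return how many were closed."""
        now = self._clock()
        closed = 0
        for idle in self._idle.values():
            keep = []
            for conn, released_at in idle:
                if now - released_at > self._max_idle:
                    conn.close()
                    closed += 1
                else:
                    keep.append((conn, released_at))
            idle[:] = keep
        return closed
```

The periodic cleanup would call reap() from a background task. A real pool would also have to detect connections that the remote side has already closed before handing them out again.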



--Shyam



*From:* Kirubakaran Kaliannan [mailto:kirubak at zadarastorage.com]
*Sent:* Thursday, September 03, 2015 3:10 PM
*To:* Shyam Kaushik
*Subject:* profiling per I/O logs





Hi Shyam,



You can look at the directory



/mnt/work/kirubak/profile/perio/*



This is for single object replica with 3 way container.



The potential tasks identified so far that we can work on with the community
are listed in the issues section (P1, O1, O2), which we can discuss:



https://docs.google.com/spreadsheets/d/1We577s7CQRfq2RmpPCN04kEHc8HD_4g_54ELPn-F0g0/edit#gid=288817690



Thanks

-kiru
-------------- next part --------------
A non-text attachment was scrubbed...
Name: perf_timeline_low.png
Type: image/png
Size: 28302 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20150908/009026bb/attachment.png>

