[Openstack-security] [Bug 1572719] Re: Client may hold socket open after ChunkWriteTimeout

Tristan Cacqueray tdecacqu at redhat.com
Wed Oct 5 07:12:53 UTC 2016


I was able to reproduce this issue on Mitaka
(openstack-swift-2.7.0-2.el7.noarch and
python2-eventlet-0.17.4-4.el7.noarch) by using the Python socket module
to open many GET connections without consuming the response.
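
A minimal sketch of that kind of client, for illustration only. The
helper names, host, port and object path below are hypothetical and are
not the exact script I used; the token header is only needed if the
container is not publicly readable.

```python
import socket


def build_get_request(host, path, token=None):
    """Build a bare HTTP/1.1 GET request as bytes."""
    lines = ["GET %s HTTP/1.1" % path, "Host: %s" % host]
    if token:
        # Hypothetical auth token; omit for a public container.
        lines.append("X-Auth-Token: %s" % token)
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def open_unread_get(host, port, path, token=None):
    """Send a GET and return the socket without ever reading from it."""
    sock = socket.create_connection((host, port))
    sock.sendall(build_get_request(host, path, token))
    # The caller keeps the socket open and never calls recv(), so the
    # server side eventually blocks while writing the response body.
    return sock
```

Holding the returned sockets in a list (for example, calling
open_unread_get() in a loop against the proxy address with the path of
a large object) keeps every connection open until the script exits.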

Extracts from swift logs:
proxy-server[14729]: ERROR with Object server 192.168.240.13:6000/swiftloopback re: Trying to GET /v1/AUTH_e90d6d7d952f4a6d9933047b461015ff/test/large_file: Timeout (10.0s) (txn: tx706ead3ee0d64af2ba382-0057f4691c)
proxy-server[14729]: Object GET returning 503 for [] (txn: tx706ead3ee0d64af2ba382-0057f4691c) (client_ip: 192.168.240.14)

After enough connections have failed with these 503 errors (about 8000
on my all-in-one instance), all swift proxy-server processes become
unresponsive until the offending client is terminated or the service is
restarted.

-- 
You received this bug notification because you are a member of OpenStack
Security, which is subscribed to OpenStack.
https://bugs.launchpad.net/bugs/1572719

Title:
  Client may hold socket open after ChunkWriteTimeout

Status in OpenStack Security Advisory:
  Incomplete
Status in OpenStack Object Storage (swift):
  New

Bug description:
  You can reproduce this by issuing a GET request for a file of a few
  hundred MB and never consuming the response, while keeping the client
  socket open. Swift will log a 499 but the socket does not always close.

  ChunkWriteTimeout is meant to protect the proxy from a slow reading client:
    https://github.com/openstack/swift/blob/master/swift/proxy/controllers/base.py#L889-L905

  Sometimes when this exception is thrown there is still data in the process socket buffer, so when eventlet tries to close the socket it first flushes it:
    https://github.com/eventlet/eventlet/blob/master/eventlet/wsgi.py#L631
    https://hg.python.org/cpython/file/v2.7.11/Lib/SocketServer.py#l711

  The problem is that if the client is not consuming the socket buffer
  then that flush will wait forever; it's trying to write on a socket
  that just threw a timeout trying to write! The flush write is not
  protected by any timeout.

  As far as I can tell, this WRITE_TIMEOUT does nothing:
    https://github.com/openstack/swift/blob/master/swift/common/wsgi.py#L407

  wsgi.server() takes a socket_timeout that might be what we're after?
    https://github.com/openstack/swift/blob/master/swift/common/wsgi.py#L437-L440

  Even with socket_timeout, eventlet needs to be patched. This should be in a finally block:
    https://github.com/eventlet/eventlet/blob/master/eventlet/wsgi.py#L636-L637
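
The idea can be sketched with plain sockets. This is not eventlet's
code: flush_and_close() is a hypothetical helper that only illustrates
bounding the final write with a socket timeout and closing the socket
in a finally block, so that a client that never reads cannot keep the
connection alive.

```python
import socket


def flush_and_close(sock, pending, timeout):
    """Try to send `pending`, but close `sock` no matter what happens.

    Returns True if the data was sent, False if the write timed out.
    """
    sock.settimeout(timeout)
    try:
        # Blocks once the peer's receive buffer and the local send
        # buffer are full; the socket timeout bounds that wait.
        sock.sendall(pending)
        return True
    except socket.timeout:
        return False
    finally:
        # Runs on success, timeout, or any other exception.
        sock.close()
```

Without the timeout the sendall() call has no upper bound on how long
it can block, and without the finally clause an exception raised by the
write would skip the close.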

  All of this is probably mitigated by most operators setting an idle
  timeout in their load balancers, but I wanted to report it. Going
  directly to a proxy I was able to hold sockets open for long periods
  of time.

  I did the initial research on version 2.2.2 but I was able to
  reproduce on 2.7.0. I have tried to translate the links to the master
  branch on GitHub. I apologize in advance if they are not quite right.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ossa/+bug/1572719/+subscriptions



