[openstack-dev] [nova] [all] Excessively high greenlet default + excessively low connection pool defaults leads to connection pool latency, timeout errors, idle database connections / workers
chris.friesen at windriver.com
Fri Jan 8 10:46:08 UTC 2016
On 01/07/2016 05:44 PM, Mike Bayer wrote:
> On 01/07/2016 07:39 AM, Clayton O'Neill wrote:
>> On Thu, Jan 7, 2016 at 2:49 AM, Roman Podoliaka <rpodolyaka at mirantis.com> wrote:
>>> Linux gurus please correct me here, but my understanding is that the
>>> Linux kernel queues up to $backlog connections *per socket*. In our
>>> case the child processes inherited the FD of the socket, so they will
>>> accept() connections from the same queue in the kernel, i.e. the
>>> backlog value is for *all* child processes, not *per* process.
>> Yes, it will be shared across all children.
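To make the shared-backlog point concrete, here is a minimal, illustrative sketch (not OpenStack code): one listening socket, several fork()ed workers, one kernel accept queue. All names and numbers below are made up for the example.

```python
# Illustrative sketch: the kernel keeps a single accept queue of up to
# `backlog` pending connections per listening socket.  fork()ed workers
# inherit the same FD, so they all accept() from that one shared queue.
import os
import socket

BACKLOG = 5        # one queue of this depth for *all* workers combined
NUM_WORKERS = 3

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(BACKLOG)
port = listener.getsockname()[1]

children = []
for _ in range(NUM_WORKERS):
    pid = os.fork()
    if pid == 0:
        # Child: inherits the listening FD; accept() dequeues the next
        # pending connection from the shared kernel backlog.
        conn, _addr = listener.accept()
        conn.sendall(str(os.getpid()).encode())
        conn.close()
        os._exit(0)
    children.append(pid)

# Parent plays client: each connection lands in the one shared queue
# and is handled by whichever worker dequeues it first.
pids_seen = set()
for _ in range(NUM_WORKERS):
    c = socket.create_connection(("127.0.0.1", port))
    pids_seen.add(c.recv(32))
    c.close()

for pid in children:
    os.waitpid(pid, 0)
```

Since each worker here accepts exactly once, the three queued connections end up spread across the three workers, all drained from the same backlog.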
>>> In each child process, the eventlet WSGI server calls accept() in a
>>> loop to get a client socket from the kernel and then puts it into a
>>> greenlet from a pool for processing:
>> It’s worse than that. What I’ve seen (via strace) is that eventlet
>> actually converts the socket into a non-blocking socket, then converts
>> that accept() into an epoll()/accept() pair in every child. Then when a
>> connection comes in, every child process wakes up out of poll and races
>> to try to accept on the non-blocking socket, and all but one of them
>> fails.
> Is that eventlet-specific, or would we see the same thing in gevent?
If you've got multiple processes all doing select()/poll()/epoll()/etc on a
single socket that has become readable, you're going to run into this sort of
thundering herd problem unless you have a separate mechanism to serialize things.
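One common serialization mechanism is an "accept mutex" held across the poll/accept pair (the approach used by, e.g., Apache's prefork MPM). A rough sketch, assuming fork()ed workers on a non-blocking listener; the lock and all names are illustrative, not from any OpenStack project:

```python
# Illustrative sketch: serialize accept() with a cross-process lock so
# a new connection wakes exactly one worker instead of the whole herd.
import multiprocessing
import os
import select
import socket

BACKLOG = 5
NUM_WORKERS = 2

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(BACKLOG)
listener.setblocking(False)              # as eventlet does
port = listener.getsockname()[1]
accept_lock = multiprocessing.Lock()     # inherited across fork()

children = []
for _ in range(NUM_WORKERS):
    pid = os.fork()
    if pid == 0:
        conn = None
        # Hold the lock across the poll/accept pair: without it, every
        # worker wakes from select()/epoll() and all but one of the
        # racing accept() calls raises BlockingIOError.
        with accept_lock:
            while conn is None:
                readable, _, _ = select.select([listener], [], [], 5)
                if not readable:
                    break                # timed out; give up
                try:
                    conn, _addr = listener.accept()
                except BlockingIOError:
                    continue             # lost a race anyway; retry
        if conn is not None:
            conn.sendall(b"x")
            conn.close()
        os._exit(0 if conn is not None else 1)
    children.append(pid)

# Parent plays client: each connection wakes only the lock holder.
replies = []
for _ in range(NUM_WORKERS):
    c = socket.create_connection(("127.0.0.1", port))
    replies.append(c.recv(1))
    c.close()

statuses = [os.waitpid(pid, 0)[1] for pid in children]
```

The alternative at the kernel level is SO_REUSEPORT, where each worker gets its own listening socket and accept queue, and the kernel spreads connections among them, sidestepping the shared-queue wakeup entirely.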