[openstack-dev] [nova] [all] Excessively high greenlet default + excessively low connection pool defaults leads to connection pool latency, timeout errors, idle database connections / workers

Clayton O'Neill clayton at oneill.net
Thu Jan 7 12:39:51 UTC 2016

On Thu, Jan 7, 2016 at 2:49 AM, Roman Podoliaka <rpodolyaka at mirantis.com> wrote:
> Linux gurus please correct me here, but my understanding is that Linux
> kernel queues up to $backlog number of connections *per socket*. In
> our case child processes inherited the FD of the socket, so they will
> accept() connections from the same queue in the kernel, i.e. the
> backlog value is for *all* child processes, not *per* process.

Yes, it will be shared across all children.
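
A minimal sketch of that sharing, for illustration only. It uses
socket.dup() within a single process as a stand-in for the FD inheritance
that happens on fork(), and the function name shared_backlog_demo is
hypothetical. Both duplicated descriptors refer to the same listening
socket, so they drain one kernel queue:

```python
import socket


def shared_backlog_demo():
    """Accept two connections through two duplicated listening FDs."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    listener.listen(128)              # one backlog, set once on the socket

    # Assumption: dup() stands in for a forked child inheriting the FD.
    worker_a = listener.dup()
    worker_b = listener.dup()
    for worker in (worker_a, worker_b):
        worker.settimeout(2.0)        # avoid hanging if something goes wrong

    # Two clients connect; both land in the same kernel accept queue.
    clients = [socket.create_connection(listener.getsockname())
               for _ in range(2)]

    # Each "worker" takes one connection out of that single queue.
    conn_a, addr_a = worker_a.accept()
    conn_b, addr_b = worker_b.accept()

    client_ports = sorted(c.getsockname()[1] for c in clients)
    accepted_ports = sorted([addr_a[1], addr_b[1]])

    for sock in clients + [conn_a, conn_b, worker_a, worker_b, listener]:
        sock.close()
    return client_ports, accepted_ports


if __name__ == "__main__":
    print(shared_backlog_demo())
```

If the two lists match, each worker pulled a different client from the
same queue, which is the behaviour Roman describes.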

> In each child process eventlet WSGI server calls accept() in a loop to
> get a client socket from the kernel and then puts into a greenlet from
> a pool for processing:

It’s worse than that.  What I’ve seen (via strace) is that eventlet actually
converts the socket into a non-blocking socket, then converts that accept()
into an epoll()/accept() pair in every child.  Then when a connection comes
in, every child process wakes up from the poll and races to accept on the
non-blocking socket, and all but one of them fail.

This means that every time there is a request, every child process is woken
up, scheduled on a CPU, and then put back to sleep.  This is one of the
reasons we’re (slowly) moving to uWSGI.
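
The wake-up-and-race pattern described above can be approximated with the
sketch below.  This is not eventlet's code: it simulates the child
processes inside one process, giving each simulated worker its own
selector on the same non-blocking listening socket.  The function name
simulate_accept_race and the worker count are made up for the example.

```python
import selectors
import socket


def simulate_accept_race(num_workers=4):
    """Return (woken, accepted) for a single incoming connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    listener.listen(128)
    listener.setblocking(False)       # accept() now fails instead of blocking

    # One selector per simulated worker, all watching the same socket.
    workers = []
    for _ in range(num_workers):
        sel = selectors.DefaultSelector()
        sel.register(listener, selectors.EVENT_READ)
        workers.append(sel)

    # A single client connects.
    client = socket.create_connection(listener.getsockname())

    # Phase 1: every worker polls and is told the socket is readable.
    woken = sum(1 for sel in workers if sel.select(timeout=1.0))

    # Phase 2: every woken worker tries accept(); only one can win.
    accepted = 0
    conns = []
    for _ in range(woken):
        try:
            conn, _addr = listener.accept()
        except BlockingIOError:
            continue                  # lost the race: queue is already empty
        accepted += 1
        conns.append(conn)

    for sel in workers:
        sel.close()
    for sock in conns + [client, listener]:
        sock.close()
    return woken, accepted


if __name__ == "__main__":
    print(simulate_accept_race())
```

In the real multi-process case each of the losing accept() calls also
costs a context switch, which is where the wasted CPU comes from.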

