[openstack-dev] [nova] [all] Excessively high greenlet default + excessively low connection pool defaults leads to connection pool latency, timeout errors, idle database connections / workers

Chris Friesen chris.friesen at windriver.com
Fri Jan 8 10:46:08 UTC 2016


On 01/07/2016 05:44 PM, Mike Bayer wrote:
>
>
> On 01/07/2016 07:39 AM, Clayton O'Neill wrote:
>> On Thu, Jan 7, 2016 at 2:49 AM, Roman Podoliaka <rpodolyaka at mirantis.com> wrote:
>>>
>>> Linux gurus please correct me here, but my understanding is that Linux
>>> kernel queues up to $backlog number of connections *per socket*. In
>>> our case child processes inherited the FD of the socket, so they will
>>> accept() connections from the same queue in the kernel, i.e. the
>>> backlog value is for *all* child processes, not *per* process.
>>
>>
>> Yes, it will be shared across all children.
>>
>>>
>>> In each child process eventlet WSGI server calls accept() in a loop to
>>> get a client socket from the kernel and then puts into a greenlet from
>>> a pool for processing:
>>
>> It’s worse than that.  What I’ve seen (via strace) is that eventlet actually
>> converts the socket into a non-blocking socket, then converts that accept() into
>> an epoll()/accept() pair in every child.  Then when a connection comes in, every
>> child process wakes up out of poll and races to try to accept on the
>> non-blocking socket, and all but one of them fail.
>
> Is that eventlet-specific, or would we see the same thing in gevent?

If you've got multiple processes all doing select()/poll()/epoll()/etc on a 
single socket that has become readable, you're going to run into this sort of 
thundering herd problem unless you have a separate mechanism to serialize things.
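
To make the wakeup/race pattern concrete, here is a rough single-process sketch in Python. It is not eventlet's or gevent's code, and it does not fork: each dup() of the listening socket stands in for a child process that inherited the FD, and select() stands in for whatever poll call the hub uses. The function name simulate_herd and the worker count are made up for illustration.

```python
import select
import socket


def simulate_herd(workers=4):
    """Deliver one connection to `workers` pollers; return (woken, accepted, failed)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
    listener.listen(16)                  # one backlog queue, shared by every handle
    listener.setblocking(False)

    # Assumption for the sketch: each dup() plays the role of a forked child
    # that inherited the listening FD. All of them share the same kernel queue.
    handles = [listener.dup() for _ in range(workers)]
    for h in handles:
        h.setblocking(False)

    # A single client connects, so exactly one connection is pending.
    client = socket.create_connection(listener.getsockname())

    # Every "child" polls the same socket and is told it is readable.
    woken = sum(1 for h in handles if select.select([h], [], [], 5.0)[0])

    accepted = failed = 0
    conns = []
    for h in handles:
        try:
            conn, _addr = h.accept()
            conns.append(conn)
            accepted += 1
        except BlockingIOError:          # EAGAIN: another handle already took it
            failed += 1

    for s in conns + handles + [client, listener]:
        s.close()
    return woken, accepted, failed


if __name__ == "__main__":
    print(simulate_herd(4))
```

With four handles and one incoming connection I would expect (4, 1, 3): all four are woken, one accept() succeeds, and the other three get EAGAIN. With real forked workers the wasted wakeups cost context switches as well. Serializing would mean something like a lock that only one process holds while it polls and accepts, which this sketch does not attempt.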

Chris


