[openstack-dev] [nova] [all] Excessively high greenlet default + excessively low connection pool defaults leads to connection pool latency, timeout errors, idle database connections / workers
mbayer at redhat.com
Thu Jan 7 16:55:41 UTC 2016
On 01/07/2016 11:02 AM, Sean Dague wrote:
> On 01/07/2016 09:56 AM, Brant Knudson wrote:
>> On Thu, Jan 7, 2016 at 6:39 AM, Clayton O'Neill
>> <clayton at oneill.net> wrote:
>> On Thu, Jan 7, 2016 at 2:49 AM, Roman Podoliaka
>> <rpodolyaka at mirantis.com> wrote:
>> > Linux gurus please correct me here, but my understanding is that the
>> > Linux kernel queues up to $backlog connections *per socket*. In
>> > our case child processes inherited the FD of the socket, so they will
>> > accept() connections from the same queue in the kernel, i.e. the
>> > backlog value is for *all* child processes, not *per* process.
>> Yes, it will be shared across all children.
>> > In each child process the eventlet WSGI server calls accept() in a loop
>> > to get a client socket from the kernel and then puts it into a greenlet
>> > from a pool for processing:
>> It’s worse than that. What I’ve seen (via strace) is that eventlet
>> converts the socket into a non-blocking socket, then converts that
>> accept() into an epoll()/accept() pair in every child. Then when a
>> connection comes in, every child process wakes up out of poll and
>> races to try to accept on the non-blocking socket, and all but one
>> of them fail.
>> This means that every time there is a request, every child process
>> is woken up, scheduled on CPU and then put back to sleep. This is one
>> of the reasons we’re (slowly) moving to uWSGI.
>> I just want to note that I've got a change proposed to devstack that
>> adds a config option to run keystone in uwsgi (rather than under
>> eventlet or in apache httpd mod_wsgi), see
>> https://review.openstack.org/#/c/257571/ . It's specific to keystone
>> since I didn't think other projects were moving away from eventlet, too.
> I feel like this is a confused point that keeps being brought up.
> The preferred long term direction of all API services is to be deployed
> on a real web server platform. It's a natural fit for those services as
> they are accepting HTTP requests and doing things with them.
> Most OpenStack projects have worker services beyond just an HTTP server.
> (Keystone is one of the very few exceptions here). Nova has nearly a
> dozen of these worker services. These don't naturally fit as wsgi apps,
> they are more traditional daemons, which accept requests over the
> network, but also have periodic jobs internally and self initiate
> actions. They are not just call / response. There is no long term
> direction for these to move off of eventlet.
This is totally speaking as an outsider without taking into account all
the history of these decisions, but the notion of "Python + we're a
daemon" == "we must use eventlet" seems a little bit rigid. The notion
of "we have background tasks" == "we can't run in a web server" is also
not clear to me. If a service intends to serve HTTP requests, that
portion of that service should be deployed in a web server. If the
system has other "background tasks", ideally those are in a separate
daemon altogether, but even if you're under something like mod_wsgi,
you can spawn a child process or worker thread regardless. You always
have a Python interpreter running and all the things it can do.