[openstack-dev] [nova] [all] Excessively high greenlet default + excessively low connection pool defaults leads to connection pool latency, timeout errors, idle database connections / workers

Roman Podoliaka rpodolyaka at mirantis.com
Tue Feb 23 11:25:35 UTC 2016

Hi all,

I've taken another look at this in order to propose patches to
oslo.service/oslo.db, so that we have better defaults for the number
of WSGI greenlets / max DB connections overflow [1] [2] - defaults
more suitable for DB-oriented services like our APIs.

I used Mike's snippet [3] for testing: 10 workers (i.e. forks) served
the WSGI app, the ab concurrency level was set to 100, and 3000
requests were sent.

With our default settings (1000 greenlets per worker, 5 connections in
the DB pool, 10 connections max overflow, 30-second timeout waiting
for a connection to become available), ~10-15 requests out of 3000
fail with 500 due to the pool timeout on every run [4].
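The failure mode itself is easy to reproduce without any of the real
stack. Below is a stdlib-only sketch (my stand-in for the SQLAlchemy
QueuePool, with the 30 s timeout and query time scaled down so it runs
quickly): 100 concurrent requests contending for 5 + 10 connections
push some waiters past the pool timeout.

```python
import queue
import threading
import time

# Hypothetical stand-in for an SQLAlchemy QueuePool: pool_size +
# max_overflow connections total, and a bounded wait for a free one.
POOL_SIZE, MAX_OVERFLOW = 5, 10
POOL_TIMEOUT = 0.2          # scaled down from the real 30 s for the demo
QUERY_TIME = 0.05           # pretend each DB query takes 50 ms
CONCURRENCY = 100           # ab-style concurrency level

pool = queue.Queue()
for i in range(POOL_SIZE + MAX_OVERFLOW):
    pool.put(i)             # 15 "connections" total

errors = []

def handle_request():
    try:
        conn = pool.get(timeout=POOL_TIMEOUT)   # wait for a connection
    except queue.Empty:
        errors.append(1)                        # -> HTTP 500 in the real app
        return
    time.sleep(QUERY_TIME)                      # the "DB query"
    pool.put(conn)

threads = [threading.Thread(target=handle_request) for _ in range(CONCURRENCY)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"{len(errors)} of {CONCURRENCY} requests hit the pool timeout")
```

The arithmetic is the same as in the real setup: 15 connections can
serve at most ~75 requests within the timeout window here, so the
remaining waiters must time out.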

As expected, load is distributed unevenly between workers: htop shows
that one worker is busy while the others are not [5]. Tracing accept()
calls with perf-events (sudo perf trace -e accept --pid=$PIDS -S)
shows the exact number of requests served by each worker [6]: the
"busy" worker served almost twice as many WSGI requests as any other
worker. The perf output [7] shows an interesting pattern: each
eventlet WSGI worker sleeps in accept() waiting for new connections to
become available in the queue handled by the kernel; when a new
connection becomes available, a random worker wakes up and tries to
accept() as many connections as possible.

Reading the source code of the eventlet WSGI server [8] suggests that
it will accept() new connections as long as they are available (and as
long as there are greenthreads available in the pool) before starting
to process the ones it has already accept()'ed (spawn_n() only creates
a new greenthread and schedules it to be executed "later"). Given that
we have 1000 greenlets in the pool, there is a high probability we'll
end up with an overloaded worker. If handling these requests involves
DB queries, we have only 5 (pool) + 10 (max overflow) DB connections
available; the rest will have to wait (and may eventually time out
after 30 seconds).
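The spawn_n() point is the crux, and can be illustrated with a tiny
cooperative scheduler (a toy, not eventlet itself): spawning only
*queues* a task, so nothing handles any request until the accept loop
yields control - which it never does while connections keep arriving.

```python
import collections

# Minimal cooperative scheduler illustrating the spawn_n() semantics
# described above: spawn() only queues the handler; queued handlers run
# only after the spawning code gives up control.
ready = collections.deque()
log = []

def spawn(fn, *args):
    ready.append((fn, args))      # schedule "later", like eventlet spawn_n

def accept_loop(n_connections):
    for i in range(n_connections):
        log.append(f"accept {i}")
        spawn(lambda i=i: log.append(f"handle {i}"))
    # the loop never yields, so every accept happens before any handler

accept_loop(3)
while ready:                      # the "scheduler" runs handlers afterwards
    fn, args = ready.popleft()
    fn(*args)

print(log)
```

All accepts are logged before any handler runs, which is exactly why a
single woken worker can drain the whole backlog.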

So it looks like there are two related problems here:

1) the distribution of load between workers is uneven. One way to fix
this is to decrease the default number of greenlets in the pool [2],
which effectively causes a particular worker to give up new
connections to other forks as soon as it has no more greenlets
available to process incoming requests. But this alone is *only*
effective when the concurrency level is greater than the number of
greenlets in the pool. Another way would be to add a context switch to
the eventlet accept() loop [8] right after spawn_n() - this is what I
got with greenthread.sleep(0.05) [9][10] (the trade-off is that we can
now only accept() 1 / 0.05 = 20 new connections per second per
worker - I'll try to experiment with the numbers here).
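To see why the context switch helps, here is a toy model (not eventlet
itself) of the two accept() strategies: connections arrive in bursts,
and whichever worker wins the wake-up race is chosen at random.
"Greedy" drains the whole burst before yielding (current behaviour);
"yielding" accepts one connection per wake-up, as if a context switch
followed every spawn_n().

```python
import random

# Toy model: 3000 connections, 10 workers, a random worker wins each
# wake-up race. Compare per-worker load for the two strategies.
WORKERS, TOTAL, BURST = 10, 3000, 50
rng = random.Random(0)   # fixed seed so the run is reproducible

def distribute(per_wakeup):
    served = [0] * WORKERS
    remaining = TOTAL
    while remaining:
        take = min(per_wakeup, remaining)
        served[rng.randrange(WORKERS)] += take   # the woken worker takes them
        remaining -= take
    return served

greedy = distribute(BURST)   # one worker accept()s the whole burst
fair = distribute(1)         # yield after every accept()

def spread(served):
    return max(served) - min(served)

print("greedy spread:", spread(greedy), "yielding spread:", spread(fair))
```

The greedy strategy hands out load in coarse 50-connection chunks, so
the gap between the busiest and idlest worker is far larger than when
each wake-up accepts a single connection.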

2) even if the distribution of load is even, we still have to be able
to process requests at the maximum level of concurrency, which is
effectively set by the number of greenlets in the pool. For
DB-oriented services that means we need DB connections available. [1]
increases the default max_overflow value to allow SQLAlchemy to open
additional connections to the DB and handle spikes of concurrent
requests. Increasing the max_overflow value further will probably lead
to max-connections errors on the RDBMS server.

As was already mentioned in this thread, the rule of thumb is that for
DB-oriented WSGI services the max_overflow value should be at least
close to the number of greenlets. Running tests on my machine shows
that 100 greenlets in the pool / 5 DB connections in the pool / 50
max_overflow / 30-second pool timeout handles up to 500 concurrent
requests without pool timeout errors.
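For reference, a sketch of how those numbers map onto config options
(the option names below are the oslo.service / oslo.db ones as I
understand them - please double-check against your release before
relying on this):

```ini
[DEFAULT]
# oslo.service: greenlets per eventlet WSGI worker
wsgi_default_pool_size = 100

[database]
# oslo.db / SQLAlchemy pool settings
max_pool_size = 5
max_overflow = 50
pool_timeout = 30
```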


[1] https://review.openstack.org/#/c/269186/
[2] https://review.openstack.org/#/c/269188/
[3] https://gist.github.com/zzzeek/c69138fd0d0b3e553a1f
[4] http://paste.openstack.org/show/487867/
[5] http://imgur.com/vEWJmrd
[6] http://imgur.com/FOZ2htf
[7] http://paste.openstack.org/show/487871/
[8] https://github.com/eventlet/eventlet/blob/master/eventlet/wsgi.py#L862-L869
[9] http://paste.openstack.org/show/487874/
[10] http://imgur.com/IuukDiD

On Mon, Jan 11, 2016 at 4:05 PM, Mike Bayer <mbayer at redhat.com> wrote:
> On 01/11/2016 05:39 AM, Radomir Dopieralski wrote:
>> On 01/08/2016 09:51 PM, Mike Bayer wrote:
>>> On 01/08/2016 04:44 AM, Radomir Dopieralski wrote:
>>>> On 01/07/2016 05:55 PM, Mike Bayer wrote:
>>>>> but also even if you're under something like
>>>>> mod_wsgi, you can spawn a child process or worker thread regardless.
>>>>> You always have a Python interpreter running and all the things it can
>>>>> do.
>>>> Actually you can't, reliably. Or, more precisely, you really shouldn't.
>>>> Most web servers out there expect to do their own process/thread
>>>> management and get really embarrassed if you do something like this,
>>>> resulting in weird stuff happening.
>>> I have to disagree with this as an across-the-board rule, partially
>>> because my own work in building an enhanced database connection
>>> management system is probably going to require that a background thread
>>> be running in order to reap stale database connections.   Web servers
>>> certainly do their own process/thread management, but a thoughtfully
>>> organized background thread in conjunction with a supporting HTTP
>>> service allows this to be feasible.   In the case of mod_wsgi,
>>> particularly when using mod_wsgi in daemon mode, spawning of threads,
>>> processes and in some scenarios even wholly separate applications are
>>> supported use cases.
>> [...]
>>> It is certainly reasonable that not all web application containers would
>>> be effective with apps that include custom background threads or
>>> processes (even though IMO any system that's running a Python
>>> interpreter shouldn't have any issues with a limited number of
>>> well-behaved daemon-mode threads), but at least in the case of mod_wsgi,
>>> this is supported; that gives Openstack's HTTP-related applications with
>>> carefully/thoughtfully organized background threads at least one
>>> industry-standard alternative besides being forever welded to its
>>> current homegrown WSGI server implementation.
>> This is still writing your application for a specific configuration of a
>> specific version of a specific implementation of the protocol on a
>> specific web server. While this may work as a stopgap solution, I think
>> it's a really bad long-term strategy. We should be programming for a
>> protocol specification (WSGI in this case), not for a particular
>> implementation (unless we need to throw in workarounds for
>> implementation bugs).
> That is fine, but then you are saying that all of those aforementioned
> Nova services which do in fact use WSGI with its own homegrown eventlet
> server should nevertheless be rewritten to not use any background
> threads, which I also presented as the ideal choice.   Right now, the
> fact that these Nova services use background threads is being used as a
> justification for why these services can never move to use a proper web
> server, even though they are still WSGI apps running inside of a WSGI
> container, so they are already doing the thing that claims to prevent
> this move from being possible.
> Also, mod_wsgi's compatibility with background threads is not linked to
> a "specific version", it's intrinsic in the organization of the product.
>   I would wager that most other WSGI containers can probably handle this
> use case as well but this would need to be confirmed.
>> At least it seems so to my naive programmer mind. Sorry for ranting,
>> I'm sure that you are aware of the trade-off here.
