[openstack-dev] [all] Replace mysql-python with mysqlclient

Robert Collins robertc at robertcollins.net
Mon May 11 21:25:07 UTC 2015


On 10 May 2015 at 03:26, John Garbutt <john at johngarbutt.com> wrote:
> On 9 May 2015 at 15:02, Mike Bayer <mbayer at redhat.com> wrote:
>> On 5/9/15 6:45 AM, John Garbutt wrote:
>>>
>>> I am leaning towards us moving to making DB calls with a thread pool and
>>> some fast C based library, so we get the 'best' performance. Is that a crazy
>>> thing to be thinking? What am I missing here? Thanks, John

So 'best' performance, and the number of processes we have are all
tied together.

tl;dr: the number of Python processes required to handle a concurrency
of N requests for a service is given by
N * avg_request_cpu_use /
  ((avg_request_cpu_use + avg_request_time_blocking) * (1 - safety_factor))
When requests are CPU bound, you need one process per concurrent request.
When requests are IO bound, you can multiplex requests into a process
until the sum of the CPU work per second exceeds your target
utilisation (which I like to keep down around 0.8 - a safety factor of
0.2 - to leave leeway for bursts).
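
To make that concrete, here's a back-of-the-envelope sketch in Python
- processes_needed is my own name and the numbers are made up, purely
illustrative:

    import math

    def processes_needed(concurrency, cpu_time, blocked_time,
                         safety_factor=0.2):
        # Fraction of each request's wall clock spent executing bytecode.
        cpu_fraction = cpu_time / (cpu_time + blocked_time)
        # Each process may only be loaded to (1 - safety_factor) of a CPU.
        return int(math.ceil(concurrency * cpu_fraction /
                             (1 - safety_factor)))

    # 100 concurrent requests, each 25% CPU / 75% blocked on IO:
    print(processes_needed(100, cpu_time=0.25, blocked_time=0.75))  # 32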

Threads don't help this at all. They don't hinder it either (broadly
speaking - Mike has very specific performance metrics that show the
overheads within the system of different multiplexing approaches).
Threads are useful for dealing with things that expect threads, like
most DB libraries. Using a thread pool is fine, but don't expect it to
alter the fundamentals around how many processes we need.
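
For illustration, a minimal sketch of that pattern with eventlet's
tpool, which our services already have available - blocking_call here
is a stand-in, not real driver code:

    import time

    from eventlet import tpool

    def blocking_call():
        # Stand-in for a C-level DBAPI call that would otherwise block
        # the whole process: eventlet can't context-switch around it.
        time.sleep(0.1)
        return 42

    # tpool.execute runs the callable in a native OS thread and yields
    # the calling greenthread until the result is ready.
    print(tpool.execute(blocking_call))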

Details: Skip over this bit if you know it all already.

The GIL plays a big factor here: if you want to scale the amount of
CPU available to a Python service, you have two routes:
A) move work to a different process through some RPC - be that DBs
using SQL, other services using oslo.messaging or HTTP - whatever.
B) use C extensions to perform work in threads - e.g. openssl context
processing.
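
As a toy illustration of route (A) - shipping CPU work to another
process - here's the shape of it using the stdlib multiprocessing
module rather than a real RPC layer like oslo.messaging:

    from multiprocessing import Pool

    def heavy(n):
        # CPU bound work; each worker process has its own GIL.
        return sum(i * i for i in range(n))

    if __name__ == '__main__':
        pool = Pool(processes=4)
        # The four chunks run genuinely in parallel, one per process.
        print(pool.map(heavy, [10 ** 6] * 4))
        pool.close()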

To increase concurrency you can use threads, eventlet, asyncio,
twisted etc - but because within a single process *all* Python
bytecode execution happens inside the GIL lock, you get at most one
CPU for a CPU bound workload. For an IO bound workload, you can fit
more work in by context switching within that one CPU of capacity.
And - the GIL is a poor scheduler, so at the limit - an IO bound
workload where the IO backend has more capacity than we have CPU to
consume it within our process - you will run into priority inversion
and other problems. [This varies by Python release too.]
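
You can see the GIL effect with a trivial experiment (illustrative,
not a benchmark): two threads grinding pure bytecode take about as
long as one thread doing the same total work, because only one thread
holds the GIL at a time:

    import threading
    import time

    def spin(n):
        while n:
            n -= 1

    N = 10 * 1000 * 1000

    start = time.time()
    spin(2 * N)
    print("serial:   %.2fs" % (time.time() - start))

    start = time.time()
    threads = [threading.Thread(target=spin, args=(N,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("threaded: %.2fs" % (time.time() - start))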

request_duration = time_in_cpu + time_blocked
request_cpu_utilisation = time_in_cpu/request_duration
cpu_utilisation = concurrency * request_cpu_utilisation

Assuming that we don't want any one process to spend a lot of time at
100% - to avoid such at-the-limit issues - let's pick say 80%
utilisation, or a safety factor of 0.2. If a single request spends
50% of its duration waiting on IO and 50% executing bytecode, we can
only run one such request concurrently without hitting 100%
utilisation (two would saturate the CPU: 2*0.5 == 1). For a request
that spends 75% of its duration waiting on IO and 25% on CPU, we can
run 3 such requests concurrently without exceeding our target of 80%
utilisation (3*0.25 = 0.75).
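
Spelling that arithmetic out in a few lines of Python:

    target_utilisation = 0.8  # i.e. 1 - safety_factor

    for cpu_fraction in (0.5, 0.25):
        max_concurrent = int(target_utilisation / cpu_fraction)
        print("%d%% CPU per request -> %d concurrent request(s) per process"
              % (cpu_fraction * 100, max_concurrent))
    # 50% CPU per request -> 1 concurrent request(s) per process
    # 25% CPU per request -> 3 concurrent request(s) per process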

What we have today in our standard architecture for OpenStack is
optimised for IO bound workloads: waiting on the
network/subprocesses/disk/libvirt etc. Running high numbers of
eventlet handlers in a single process only works when the majority of
the work being done by a handler is IO.

For some of our servers, e.g. nova-compute, where we're spending a lot
of time waiting on the DB (via the conductor), or libvirt, or VMWare
callouts etc - this makes a lot of sense. In fact it's nearly ideal:
we spend next to no time executing bytecode, and the majority of time
waiting.

For other servers, e.g. heat-engine or murano, where we are doing
complex processing of the state stored in the persistent store backing
the system, that ratio is going to change dramatically.

And for some, like nova-conductor, the better and faster we make the
DB layer, the less time we spend blocked, and the *less* concurrency
we can support in a single process. (Though hopefully less concurrency
is needed for a given workload.)

So - a thread pool doesn't help with the number of processes we need.

>> I'd like to do that but I want the whole Openstack DB API layer in the
>> thread pool, not just the low level DBAPI (Python driver) calls.   There's
>> no need for eventlet-style concurrency or even less for async-style
>> concurrency in transactionally-oriented code.
>
> Sorry, not sure I get which DB API is which.
>
> I was thinking we could dispatch all calls to this API into a thread pool:
> https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py

That would work I think.
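
Roughly, I'd imagine something like this - wrap_db_call is a made-up
name, just to show the shape of dispatching each api.py function into
eventlet's native thread pool:

    import functools

    from eventlet import tpool

    def wrap_db_call(func):
        # Push the (blocking) DB API function into eventlet's native
        # thread pool so greenthreads keep running meanwhile.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return tpool.execute(func, *args, **kwargs)
        return wrapper

    # Applied mechanically to each public function in
    # nova/db/sqlalchemy/api.py, e.g.:
    # instance_get_all = wrap_db_call(instance_get_all)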

> I guess an alternative is to add this in the objects layer, on top of
> the rpc dispatch:
> https://github.com/openstack/nova/blob/master/nova/objects/base.py#L188
> But that somehow feels like a layer violation, maybe its not.

No opinion here, sorry :)

-Rob


-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud


