[openstack-dev] [all] Replace mysql-python with mysqlclient

John Garbutt john at johngarbutt.com
Sat May 9 10:45:26 UTC 2015


On 30 April 2015 at 18:54, Mike Bayer <mbayer at redhat.com> wrote:
> On 4/30/15 11:16 AM, Dan Smith wrote:
>>> There is an open discussion to replace mysql-python with PyMySQL, but
>>> PyMySQL has worse performance:
>>>
>>> https://wiki.openstack.org/wiki/PyMySQL_evaluation
>>
>> My major concern with not moving to something different (i.e. not based
>> on the C library) is the threading problem. Especially as we move in the
>> direction of cellsv2 in nova, not blocking the process while waiting for
>> a reply from mysql is going to be critical. Further, I think that we're
>> likely to get back a lot of performance from a supports-eventlet
>> database connection because of the parallelism that conductor currently
>> can only provide in exchange for the footprint of forking into lots of
>> workers.
>>
>> If we're going to move, shouldn't we be looking at something that
>> supports our threading model?
>
> yes, but at the same time, we should change our threading model at the level
> of where APIs are accessed to refer to a database, at the very least using a
> threadpool behind eventlet.   CRUD-oriented database access is faster using
> traditional threads, even in Python, than using an eventlet-like system or
> using explicit async.  The tests at
> http://techspot.zzzeek.org/2015/02/15/asynchronous-python-and-databases/
> show this.    With traditional threads, we can stay on the C-based MySQL
> APIs and take full advantage of their speed.

Sorry to go back in time, but I wanted to return to an important point.

It seems we have three possible approaches:
* C lib and eventlet, blocks whole process
* pure python lib, and eventlet, eventlet does its thing
* go for a C lib and dispatch calls via thread pool

We have a few problems:
* performance sucks, we have to fork lots of nova-conductors and api nodes
* need to support python2.7 and 3.4, but it's not currently possible
with the lib we use?
* want to pick a lib that we can fix when there are issues, and work to improve

It sounds like:
* we currently do the first one; it sucks, though forking nova-conductor helps
* we seem to think the second one might work, and it would certainly get us
py3.4 + py2.7 support
* the last will mean more work, but it's likely to be more performant
* we are worried about picking an unsupported lib with little future

I am leaning towards us making DB calls through a thread pool, using
some fast C-based library, so we get the 'best' performance.
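
To make the thread pool option concrete, here is a rough sketch of the
shape I have in mind. It is only an illustration under assumptions: it
uses the stdlib concurrent.futures pool rather than anything
eventlet-specific (eventlet has its own tpool module for this kind of
dispatch), sqlite3 stands in for a C-based MySQL driver, and the names
DB_POOL, _blocking_query and db_call are made up for the example.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool size; a real service would tune this.
DB_POOL = ThreadPoolExecutor(max_workers=4)

# Each worker thread keeps its own connection, since DB-API
# connections are generally not safe to share across threads.
_local = threading.local()


def _get_connection():
    # sqlite3 is only a stand-in here for a C-based MySQL driver.
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(":memory:")
        _local.conn = conn
    return conn


def _blocking_query(sql, params=()):
    # Runs inside a worker thread, so blocking in the driver
    # does not stall the thread that submitted the call.
    return _get_connection().execute(sql, params).fetchall()


def db_call(sql, params=()):
    # Hand the blocking call to the pool and return a Future.
    return DB_POOL.submit(_blocking_query, sql, params)


# Submit several queries at once, then collect the results.
futures = [db_call("SELECT ? + ?", (i, i)) for i in range(5)]
results = [f.result()[0][0] for f in futures]
DB_POOL.shutdown()
```

The point is just that the caller gets a future back straight away and
the blocking happens in the worker threads. How that future gets
bridged back into eventlet's green threads is the part that would need
real work.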

Is that a crazy thing to be thinking? What am I missing here?

Thanks,
John


