[openstack-dev] [all] Replace mysql-python with mysqlclient

Mike Bayer mbayer at redhat.com
Tue May 12 14:24:53 UTC 2015



On 5/11/15 9:17 PM, Robert Collins wrote:
> On 12 May 2015 at 10:44, Mike Bayer <mbayer at redhat.com> wrote:
>
>>> What we have today in our standard architecture for OpenStack is
>>> optimised for IO bound workloads: waiting on the
>>> network/subprocesses/disk/libvirt etc. Running high numbers of
>>> eventlet handlers in a single process only works when the majority of
>>> the work being done by a handler is IO.
>>
>> Everything stated here is great, however in our situation there is one
>> unfortunate fact which renders it completely incorrect at the moment.   I'm
>> still puzzled why we are getting into deep think sessions about the vagaries
>> of the GIL and async when there is essentially a full-on red-alert
>> performance blocker rendering all of this discussion useless, so I must
>> again remind us: what we have *today* in Openstack is *as completely
>> un-optimized as you can possibly be*.
> Sorry if it seems like I went on a tangent, but choosing a concurrency
> model in Python, which a lot of this discussion has been about, is
> inextricably linked to the workload being tackled. The point of my
> tl;dr was that using threads - which gets us out of the pit below - is
> fine for most of our workloads and irrelevant to the actual issues in
> the other ones. Clearly that didn't come across. - Sorry.
Robert -

Other people noted my fast takeoff as well, so I think I saw "GIL" and
lots of thoughtful calculations and, after that, my reading comprehension
was dulled by the fog of my own angst :). I'll try to slow down more
next time.


>> The most GIL-heavy nightmare CPU bound task you can imagine running on 25
>> threads on a ten year old Pentium will run better than the Openstack we have
>> today, because we are running a C-based, non-eventlet patched DB library
>> within a single OS thread that happens to use eventlet, but the use of
>> eventlet is totally pointless because right now it blocks completely on all
>> database IO.
> To confirm my understanding: this library releases the GIL, but
> because we only have one thread, we don't get more work done.
>
> Yes, that sucks. And your tl;dr is that we need to either use an
> eventlet ready library or not use eventlet's greenthreads, either of
> which I support as a short term rectification.
Yes, the GIL is released within the MySQLdb C routines that are
primarily focused on IO here.
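
A minimal sketch of why that release only helps when there is more than
one OS thread. Nothing here touches MySQLdb or eventlet: time.sleep
stands in for a C routine that blocks on IO with the GIL released, and
the names and the 50 ms latency are assumptions made up for
illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

QUERY_SECONDS = 0.05  # assumed latency of one database round trip


def fake_blocking_query(_):
    # time.sleep releases the GIL while it waits, much as a C driver
    # does while it is blocked on a socket read.
    time.sleep(QUERY_SECONDS)


def run_serial(n):
    """One OS thread: each blocking call finishes before the next starts."""
    start = time.perf_counter()
    for i in range(n):
        fake_blocking_query(i)
    return time.perf_counter() - start


def run_threaded(n):
    """n OS threads: the waits overlap because the GIL is released."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(fake_blocking_query, range(n)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print("serial:   %.2fs" % run_serial(5))
    print("threaded: %.2fs" % run_threaded(5))
```

The serial case is roughly what a single-threaded eventlet process gets
from an unpatched C driver: the other greenthreads cannot run while one
of them sits inside the blocking call.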


>
>> Robert's analysis talks about various "at the limit" issues,  but I was
> They tend to turn up at scale. You get 100 requests a day out of 5
> million that are inexplicably slow, and eventually you have enough
> data around the situation to try an experiment, and lo and behold the
> problem goes away. They don't disagree with the argument you're making
> though - this is just the bigger context, when folk go to deploy our
> (real threads || eventlet friendly DB library) code, how many
> processes will they need?
It's been pointed out separately that OpenStack already uses a lot of
processes, and even now, with our serialized DB access per process, we
still achieve concurrency through this. So by all means, let's keep
using processes; that is always a good thing, although it does present
the challenge that we have a lot of DB connections opened as a result
(because we use pooling).
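
To put a rough number on that, here is a back-of-the-envelope sketch. It
assumes every process holds its own SQLAlchemy QueuePool with the
library defaults (pool_size=5, max_overflow=10). The function name and
the figure of 40 worker processes are hypothetical, not measurements
from any deployment.

```python
def max_db_connections(processes, pool_size=5, max_overflow=10):
    """Upper bound on open connections when each process has its own pool.

    pool_size and max_overflow default to SQLAlchemy's QueuePool defaults;
    a real deployment may configure them differently.
    """
    return processes * (pool_size + max_overflow)


if __name__ == "__main__":
    # Hypothetical deployment: 40 worker processes across all services.
    print(max_db_connections(40))
```

This is a ceiling rather than a typical value, since a pool only opens
connections as they are actually requested.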


>
> FWIW, I think moving to an eventlet friendly library should be the
> first step because it can be done much more rapidly and with arguably
> less risk.
Yes, I'm not really sure why we aren't just changing "mysql+mysqldb://"
to "mysql+pymysql://" in our config files right now, because this
would also solve the Py3K issue for the time being.
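
As a sketch of how small that change is, the helper below rewrites only
the driver part of a connection URL and leaves everything else alone.
The function name, the credentials and the host are hypothetical, and it
uses plain string handling rather than any SQLAlchemy API.

```python
def switch_driver(url, old="mysql+mysqldb", new="mysql+pymysql"):
    """Return url with its dialect+driver prefix swapped from old to new.

    URLs that do not start with the old prefix are returned unchanged.
    """
    scheme, sep, rest = url.partition("://")
    if sep and scheme.lower() == old:
        return new + sep + rest
    return url


if __name__ == "__main__":
    # Hypothetical connection string, as it might appear in a config file.
    print(switch_driver("mysql+mysqldb://nova:secret@dbhost/nova"))
```

A bare "mysql://" URL is left alone here, although SQLAlchemy also
resolves that to MySQLdb by default, so such entries would need the
driver spelled out explicitly.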




