[Openstack-operators] DB deadlocks due to connection string

Mike Bayer mbayer at redhat.com
Tue May 23 20:38:38 UTC 2017



On 05/23/2017 04:17 PM, Doug Hellmann wrote:
> 
> 
>> On May 23, 2017, at 4:01 PM, Doug Hellmann <doug at doughellmann.com>
>> wrote:
>>
>> Excerpts from Sean McGinnis's message of 2017-05-23 11:38:34 -0500:
>>>>
>>>> This sounds like something we could fix completely by dropping the
>>>> use of the offending library. I know there was a lot of work done
>>>> to get pymysql support in place. It seems like we can finish that by
>>>> removing support for the old library and redirecting mysql://
>>>> connections to use pymysql.
>>>>
>>>> Doug
>>>>
>>>
>>> I think that may be ideal. If there are known issues with the library,
>>> and we have a different and well tested alternative that we know works,
>>> it's probably easier all around to just redirect internally to use
>>> pymysql.
>>
>> Now we just have to find the code that's doing the mapping to the
>> driver. It doesn't seem to be oslo.db. Is it sqlalchemy?
> 
> Mike, do you have any insight into the best approach for this?

OK so the way this works is:

1. if you are using SQLAlchemy by itself, and you send a URL like
"mysql://user:pass@host/dbname" that omits the "+driver" portion, a
default driver is selected; in the case of MySQL it uses the driver that
imports under "import MySQLdb".  This is either mysqlclient or the
older MySQL-Python which it replaces; these are native drivers written
in C, and the eventlet monkeypatching we use does not manage to modify
them to act in a non-blocking fashion, so you get new kinds of
deadlocks you wouldn't normally get when using eventlet.

2. If you send a URL like we want nowadays,
"mysql+pymysql://user:pass@host/dbname", you get the pure-Python PyMySQL
driver we've standardized upon, which works under eventlet
monkeypatching, and you don't get weird deadlocks of this nature.

3. The database URLs are inside of the .conf files for all the services, 
individually, such as nova.conf, neutron.conf, etc.   These got there 
based on the installer that one used, and from that point on I don't 
think they change (it's possible that installers like tripleo might be 
able to alter the files when you do an upgrade).
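
To make the "+driver" portion concrete, here is a rough sketch in plain
Python of how the scheme of such a URL breaks down.  This is not how
SQLAlchemy itself parses URLs, and the function name is made up for
illustration; it only shows which part of the string decides the driver.

```python
def split_backend_and_driver(db_url):
    """Return (backend, driver) for a SQLAlchemy-style database URL.

    driver is None when the "+driver" portion is omitted, which is the
    case where SQLAlchemy falls back to its default driver.
    """
    scheme, sep, _rest = db_url.partition("://")
    if not sep:
        raise ValueError("not a database URL: missing '://'")
    backend, _plus, driver = scheme.partition("+")
    return backend, (driver or None)


if __name__ == "__main__":
    for url in ("mysql://user:pass@host/dbname",
                "mysql+pymysql://user:pass@host/dbname"):
        print(split_backend_and_driver(url))
```

The first URL yields no driver, so the choice is left to SQLAlchemy's
default; the second names PyMySQL explicitly.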

So the reason things "work" for people is that their installation /
upgrade process has ensured that a MySQL database URL for a process that
uses eventlet is of the form "mysql+pymysql://".   If that hasn't
happened somewhere, then we'd have this problem.   I think the problem
first and foremost needs to be "fixed" at this layer, e.g.
"mysql+pymysql://" should preferably be explicit for as long as we use
pure SQLAlchemy database URLs in our config files (i.e. these should
either be fully correct URLs, or we shouldn't be using URLs at all if
some other layer makes decisions about the database connection string).

On the "database management" side of things, e.g. projects that use
oslo.db, we can look into failing an immediate assertion if a database
backend but no driver is specified, i.e. disabling SQLAlchemy's usual
"default driver" selection logic.   This is the minimum we should
probably do here; however, it will cause existing installations that
"sort of work" right now to "not work" at all until the configuration is
fixed.
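
Such an assertion could look roughly like the sketch below.  This is
not existing oslo.db code: the function name and the "backends"
parameter are hypothetical, and limiting the check to "mysql" by
default is an assumption made here so that driverless URLs for other
backends (e.g. "sqlite://") keep working in the example.

```python
def assert_explicit_driver(db_url, backends=("mysql",)):
    """Fail fast if db_url names a checked backend without a driver.

    Hypothetical helper: returns the URL unchanged when it is
    acceptable, and raises ValueError otherwise.  The URL is never
    rewritten.
    """
    scheme, sep, _rest = db_url.partition("://")
    if not sep:
        raise ValueError("not a database URL: missing '://'")
    backend, _plus, driver = scheme.partition("+")
    if backend in backends and not driver:
        # Report only the scheme so credentials never end up in logs.
        raise ValueError(
            "database URL scheme %r names no driver; write "
            "'%s+<driver>://' explicitly" % (scheme, backend)
        )
    return db_url


if __name__ == "__main__":
    assert_explicit_driver("mysql+pymysql://user:pass@host/dbname")  # passes
    try:
        assert_explicit_driver("mysql://user:pass@host/dbname")
    except ValueError as exc:
        print("rejected:", exc)
```

The point of raising rather than rewriting is that the .conf file stays
the single source of truth for which driver is in use.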

If we truly want to implicitly force the driver to be "pymysql" when
"mysql" is present without a driver, we can do that also, but that feels
kind of wrong to me.  There are all kinds of things that might need to
happen to database URLs, and it would be unfortunate if we started just
hardcoding driver decisions in oslo.db without solving the issue of
configuration in a more general sense.  It's also misleading to
continue to have full SQLAlchemy URLs in the conf files that get
silently altered by a middle tier; better would be to change the format
of the database configuration so that it can't be mistaken for a
SQLAlchemy URL.  More flexible still would be some kind of
"registry"-oriented configuration so that connection URLs across many
services could be more centralized (this is how ODBC works, for
example), but that is also another layer of complexity.

I'd mostly want to understand how we have "mysql://" URLs floating 
around as the installers / upgraders should have taken care of that 
issue some time ago.  Otherwise we shouldn't have full database URLs in 
our .conf files if a middle layer is going to silently change them anyway.

The simplest fix here is of course that if someone has the old-style
URLs, they just fix them to be "correct".  I'm mostly comfortable with
adding assertions for this, but not as much with silently "fixing" URLs.
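
For anyone hunting for old-style URLs, a rough sketch of a scan over a
set of .conf files follows.  It assumes the URL lives in an option
named "connection" (as in the [database] section), the function name is
made up, and Python's configparser is not oslo.config, so it may not
read every service's file exactly the way the service does.  It only
reports; it does not modify anything.

```python
import configparser


def find_driverless_mysql_urls(conf_paths):
    """Return (path, section, url) for each "connection" option whose
    value starts with "mysql://", i.e. a MySQL URL with no "+driver"."""
    hits = []
    for path in conf_paths:
        parser = configparser.ConfigParser(
            interpolation=None,   # URLs and passwords may contain '%'
            strict=False,         # tolerate repeated sections/options
            allow_no_value=True,  # tolerate bare option names
        )
        parser.read(path)  # paths that do not exist are skipped
        for section in parser.sections():
            url = parser.get(section, "connection", fallback=None)
            if url and url.startswith("mysql://"):
                hits.append((path, section, url))
    return hits


if __name__ == "__main__":
    import sys
    for path, section, _url in find_driverless_mysql_urls(sys.argv[1:]):
        # Print only the location; the URL itself contains a password.
        print("%s [%s]: connection uses mysql:// without a driver"
              % (path, section))
```

Run it with the config files as arguments, e.g. nova.conf and
neutron.conf, then edit any reported entries by hand to use
"mysql+pymysql://".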
