[openstack-dev] [all] [ha] potential issue with implicit async-compatible mysql drivers
Ihar Hrachyshka
ihrachys at redhat.com
Fri Dec 12 14:27:21 UTC 2014
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Reading the latest comments at
https://github.com/PyMySQL/PyMySQL/issues/275, it seems to me that the
issue is not to be solved in drivers themselves but instead in
libraries that arrange connections (sqlalchemy/oslo.db), correct?
Will the proposed connection reopening help?
/Ihar
On 05/12/14 23:43, Mike Bayer wrote:
> Hey list -
>
> I’m posting this here just to get some ideas on what might be
> happening here, as it may or may not have some impact on Openstack
> if and when we move to MySQL drivers that are async-patchable, like
> MySQL-connector or PyMySQL. I had a user post this issue a few
> days ago which I’ve since distilled into test cases for PyMySQL and
> MySQL-connector separately. It uses gevent, not eventlet, so I’m
> not really sure if this applies. But there’s plenty of very smart
> people here so if anyone can shed some light on what is actually
> happening here, that would help.
>
> The program essentially illustrates code that performs several
> steps upon a connection, however if the greenlet is suddenly
> killed, the state from the connection, while damaged, is still
> being allowed to continue on in some way, and what’s
> super-catastrophic here is that you see a transaction actually
> being committed *without* all the statements proceeding on it.
>
> In my work with MySQL drivers, I’ve noted for years that they are
> all very, very bad at dealing with concurrency-related issues. The
> whole “MySQL has gone away” and “commands out of sync” errors are
> ones that we’ve all just drowned in, and so often these are due to
> the driver getting mixed up due to concurrent use of a connection.
> However this one seems more insidious. Though at the same time,
> the script has some complexity happening (like a simplistic
> connection pool) and I’m not really sure where the core of the
> issue lies.
>
> The script is at
> https://gist.github.com/zzzeek/d196fa91c40cb515365e and also below.
> If you run it for a few seconds, go over to your MySQL command line
> and run this query:
>
> SELECT * FROM table_b WHERE a_id not in (SELECT id FROM table_a)
> ORDER BY a_id DESC;
>
> and what you’ll see is tons of rows in table_b where the “a_id” is
> zero (because cursor.lastrowid fails), but the *rows are
> committed*. If you read the segment of code that does this, it
> should be impossible:
>
> connection = pool.get() rowid = execute_sql( connection, "INSERT
> INTO table_a (data) VALUES (%s)", ("a",) )
>
> gevent.sleep(random.random() * 0.2) try: execute_sql( connection,
> "INSERT INTO table_b (a_id, data) VALUES (%s, %s)", (rowid, "b",)
> ) connection.commit() pool.return_conn(connection)
>
> except Exception: connection.rollback()
> pool.return_conn(connection)
>
> so if the gevent.sleep() throws a timeout error, somehow we are
> getting thrown back in there, with the connection in an invalid
> state, but not invalid enough to commit.
>
> If a simple check for “SELECT connection_id()” is added, this query
> fails and the whole issue is prevented. Additionally, if you put a
> foreign key constraint on that b_table.a_id, then the issue is
> prevented, and you see that the constraint violation is happening
> all over the place within the commit() call. The connection is
> being used such that its state just started after the
> gevent.sleep() call.
>
> Now, there’s also a very rudimental connection pool here. That is
> also part of what’s going on. If i try to run without the pool,
> the whole script just runs out of connections, fast, which suggests
> that this gevent timeout cleans itself up very, very badly.
> However, SQLAlchemy’s pool works a lot like this one, so if folks
> here can tell me if the connection pool is doing something bad,
> then that’s key, because I need to make a comparable change in
> SQLAlchemy’s pool. Otherwise I worry our eventlet use could have
> big problems under high load.
>
>
>
>
>
> # -*- coding: utf-8 -*- import gevent.monkey
> gevent.monkey.patch_all()
>
> import collections import threading import time import random
> import sys
>
> import logging logging.basicConfig() log =
> logging.getLogger('foo') log.setLevel(logging.DEBUG)
>
> #import pymysql as dbapi from mysql import connector as dbapi
>
>
> class SimplePool(object): def __init__(self): self.checkedin =
> collections.deque([ self._connect() for i in range(50) ])
> self.checkout_lock = threading.Lock() self.checkin_lock =
> threading.Lock()
>
> def _connect(self): return dbapi.connect( user="scott",
> passwd="tiger", host="localhost", db="test")
>
> def get(self): with self.checkout_lock: while not self.checkedin:
> time.sleep(.1) return self.checkedin.pop()
>
> def return_conn(self, conn): try: conn.rollback() except:
> log.error("Exception during rollback", exc_info=True) try:
> conn.close() except: log.error("Exception during close",
> exc_info=True)
>
> # recycle to a new connection conn = self._connect() with
> self.checkin_lock: self.checkedin.append(conn)
>
>
> def verify_connection_id(conn): cursor = conn.cursor() try:
> cursor.execute("select connection_id()") row = cursor.fetchone()
> return row[0] except: return None finally: cursor.close()
>
>
> def execute_sql(conn, sql, params=()): cursor = conn.cursor()
> cursor.execute(sql, params) lastrowid = cursor.lastrowid
> cursor.close() return lastrowid
>
>
> pool = SimplePool()
>
> # SELECT * FROM table_b WHERE a_id not in # (SELECT id FROM
> table_a) ORDER BY a_id DESC;
>
> PREPARE_SQL = [ "DROP TABLE IF EXISTS table_b", "DROP TABLE IF
> EXISTS table_a", """CREATE TABLE table_a ( id INT NOT NULL
> AUTO_INCREMENT, data VARCHAR (256) NOT NULL, PRIMARY KEY (id) )
> engine='InnoDB'""", """CREATE TABLE table_b ( id INT NOT NULL
> AUTO_INCREMENT, a_id INT NOT NULL, data VARCHAR (256) NOT NULL, --
> uncomment this to illustrate where the driver is attempting -- to
> INSERT the row during ROLLBACK -- FOREIGN KEY (a_id) REFERENCES
> table_a(id), PRIMARY KEY (id) ) engine='InnoDB' """]
>
> connection = pool.get() for sql in PREPARE_SQL:
> execute_sql(connection, sql) connection.commit()
> pool.return_conn(connection) print("Table prepared...")
>
>
> def transaction_kill_worker(): while True: try: connection = None
> with gevent.Timeout(0.1): connection = pool.get() rowid =
> execute_sql( connection, "INSERT INTO table_a (data) VALUES (%s)",
> ("a",)) gevent.sleep(random.random() * 0.2)
>
> try: execute_sql( connection, "INSERT INTO table_b (a_id, data)
> VALUES (%s, %s)", (rowid, "b",))
>
> # this version prevents the commit from # proceeding on a bad
> connection # if verify_connection_id(connection): #
> connection.commit()
>
> # this version does not. It will commit the # row for table_b
> without the table_a being present. connection.commit()
>
> pool.return_conn(connection) except Exception:
> connection.rollback() pool.return_conn(connection)
> sys.stdout.write("$") except gevent.Timeout: # try to return the
> connection anyway if connection is not None:
> pool.return_conn(connection) sys.stdout.write("#") except
> Exception: # logger.exception(e) sys.stdout.write("@") else:
> sys.stdout.write(".") finally: if connection is not None:
> pool.return_conn(connection)
>
>
> def main(): for i in range(50):
> gevent.spawn(transaction_kill_worker)
>
> gevent.sleep(3)
>
> while True: gevent.sleep(5)
>
>
> if __name__ == "__main__": main()
>
>
>
>
>
>
> _______________________________________________ OpenStack-dev
> mailing list OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
iQEcBAEBCgAGBQJUivtJAAoJEC5aWaUY1u57nn8IAJP5zK/htG8EeoOSWZVV1ksA
en+lQuIA09aNdkHSNS1b/lNOYwsF4X5SM0dU5Cs4LCsumC5jM9S/cNOn3sfpVooA
vd31O1kKtd255YtnsKSmrOPiytoI69n2/65tVqgWLHpuXRSaj4HtqOEY/vOWMX6g
BON50QUYwwxAZLNOPmEO7vUnJ3VYO6zquH2mQrA1Vg/LCm3+VaodEHOVCxieaJ/n
iQPB4Vx1dkuP10HzWyjQW0j4kbUakqgkq/VHaiCYNC85HzPz6KJUOK/neZcBrWsZ
RQcLae1dX1yGMXDd5hyJaoe3qUfjuvSZmV5jS3ok/x8rnKdmVl65PtUUlLfVOU0=
=wJ55
-----END PGP SIGNATURE-----
More information about the OpenStack-dev
mailing list