[openstack-dev] [Neutron] Issue with pymysql
clint at fewbar.com
Fri Jun 12 17:56:14 UTC 2015
Excerpts from Mike Bayer's message of 2015-06-12 09:42:42 -0700:
> On 6/12/15 11:37 AM, Mike Bayer wrote:
> > On 6/11/15 9:32 PM, Eugene Nikanorov wrote:
> >> Hi neutrons,
> >> I'd like to draw your attention to an issue discovered by the rally gate job:
> >> http://logs.openstack.org/96/190796/4/check/gate-rally-dsvm-neutron-rally/7a18e43/logs/screen-q-svc.txt.gz?level=TRACE
> >> I don't have bandwidth to take a deep look at it, but my first
> >> impression is that it is some issue with nested transaction support
> >> on either the SQLAlchemy or the PyMySQL side.
> >> Also, besides errors with nested transactions, there are a lot of
> >> Lock wait timeouts.
> >> I think it makes sense to start with reverting the patch that moves
> >> to pymysql.
> > My immediate reaction is that this is perhaps a concurrency-related
> > issue; because PyMySQL is pure python and allows for full blown
> > eventlet monkeypatching, I wonder if somehow the same PyMySQL
> > connection is being used in multiple contexts. E.g. one greenlet
> > starts up a savepoint, using identifier "_3" which is based on a
> > counter that is local to the SQLAlchemy Connection, but then another
> > greenlet shares that PyMySQL connection somehow with another
> > SQLAlchemy Connection that uses the same identifier.
> Reading more of the log, it seems the main issue is just that there's a
> deadlock on inserting into the securitygroups table. A deadlock on
> insert can be caused by an index being locked.
> I'd be curious to know how many greenlets are running concurrently here,
> and what the overall transaction looks like within the operation that is
> failing here (e.g. does each transaction insert multiple rows into
> securitygroups? that would make a deadlock seem more likely).
This raises two questions:
1) Are we handling deadlocks with retries? It's important that we do
that to be defensive.
2) Are we careful to touch tables in a consistent (sorted) order in
every multi-table transaction, so that we minimize the chance of
cross-table deadlocks?
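
To make both questions concrete, here is a rough sketch of what I mean.
It is not Neutron code: DeadlockError, retry_on_deadlock, lock_order and
the table names are made up for illustration, and a real version would
catch whatever deadlock exception the DB layer actually raises (for
example DBDeadlock from oslo.db) rather than this stand-in.

```python
import functools
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the DB layer's deadlock exception
    (MySQL reports these as error 1213)."""


def retry_on_deadlock(max_retries=3, base_delay=0.0):
    """Re-run the whole decorated transaction if it hits a deadlock."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    # Out of retries: let the caller see the failure.
                    if attempt == max_retries:
                        raise
                    # Exponential backoff before the next attempt.
                    time.sleep(base_delay * (2 ** attempt))
        return wrapper
    return decorator


def lock_order(tables):
    """Return table names in one canonical order, so that every
    transaction touches them in the same sequence."""
    return sorted(set(tables))


# Simulated transaction: deadlocks twice, then succeeds.
attempts = []


@retry_on_deadlock(max_retries=3)
def create_security_group(name):
    attempts.append(name)
    if len(attempts) < 3:
        raise DeadlockError("simulated deadlock")
    # Illustrative table names; the caller would write to them in this order.
    return lock_order(["securitygroups", "ports"])
```

The retry has to wrap the entire transaction, not just the failing
statement, because MySQL rolls the whole transaction back when it picks
a deadlock victim.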