[openstack-dev] [Neutron] Issue with pymysql

Armando M. armamig at gmail.com
Mon Jun 22 21:10:48 UTC 2015


Hi,

A brief update on the issue that sparked this thread:

A little over a week ago, bug [1] was filed. The gist is that the switch to
pymysql exposed a number of latent race conditions that made Neutron
unstable.

To try and nip these in the bud, the Neutron team filed a number of patches
[2] to create an unstable configuration that would allow them to
troubleshoot and experiment with a solution while still keeping stability
in check (a preliminary proposal for a fix is available in [4]).

The latest failure rate trend is shown in [3]; as you can see, we're still
gathering data, but it seems that the instability gap between the two jobs
(stable vs unstable) has widened, and should give us plenty of data points
to devise a resolution strategy.

I have documented the most recurrent traces in the bug report [1].

I will update once we have managed to get the two curves to kiss each other
again, at a more acceptable failure rate.

Cheers,
Armando

[1] https://bugs.launchpad.net/neutron/+bug/1464612
[2] https://review.openstack.org/#/q/topic:neutron-unstable,n,z
[3] http://goo.gl/YM7gUC
[4] https://review.openstack.org/#/c/191540/


On 12 June 2015 at 11:13, Boris Pavlovic <bpavlovic at mirantis.com> wrote:

> Sean,
>
> Thanks for the quick fix/revert https://review.openstack.org/#/c/191010/
> This unblocked the Rally gates...
>
> Best regards,
> Boris Pavlovic
>
> On Fri, Jun 12, 2015 at 8:56 PM, Clint Byrum <clint at fewbar.com> wrote:
>
>> Excerpts from Mike Bayer's message of 2015-06-12 09:42:42 -0700:
>> >
>> > On 6/12/15 11:37 AM, Mike Bayer wrote:
>> > >
>> > >
>> > > On 6/11/15 9:32 PM, Eugene Nikanorov wrote:
>> > >> Hi neutrons,
>> > >>
>> > >> I'd like to draw your attention to an issue discovered by rally gate
>> > >> job:
>> > >>
>> > >> http://logs.openstack.org/96/190796/4/check/gate-rally-dsvm-neutron-rally/7a18e43/logs/screen-q-svc.txt.gz?level=TRACE
>> > >>
>> > >> I don't have bandwidth to take a deep look at it, but first
>> > >> impression is that it is some issue with nested transaction support
>> > >> either on sqlalchemy or pymysql side.
>> > >> Also, besides errors with nested transactions, there are a lot of
>> > >> Lock wait timeouts.
>> > >>
>> > >> I think it makes sense to start with reverting the patch that moves
>> > >> to pymysql.
>> > > My immediate reaction is that this is perhaps a concurrency-related
>> > > issue; because PyMySQL is pure python and allows for full blown
>> > > eventlet monkeypatching, I wonder if somehow the same PyMySQL
>> > > connection is being used in multiple contexts. E.g. one greenlet
>> > > starts up a savepoint, using identifier "_3" which is based on a
>> > > counter that is local to the SQLAlchemy Connection, but then another
>> > > greenlet shares that PyMySQL connection somehow with another
>> > > SQLAlchemy Connection that uses the same identifier.
>> >
>> > reading more of the log, it seems the main issue is just that there's a
>> > deadlock on inserting into the securitygroups table.  The deadlock on
>> > insert can be because of an index being locked.
>> >
>> >
>> > I'd be curious to know how many greenlets are running concurrently here,
>> > and what the overall transaction looks like within the operation that is
>> > failing here (e.g. does each transaction insert multiple rows into
>> > securitygroups?  that would make a deadlock seem more likely).
>>
>> This raises two questions:
>>
>> 1) Are we handling deadlocks with retries? It's important that we do
>> that to be defensive.
>>
>> 2) Are we being careful to sort the table order in any multi-table
>> transactions so that we minimize the chance of cross-table deadlocks?
>>
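
The two questions above (retrying on deadlock, and touching tables in a
fixed order) can be sketched roughly as follows. This is a minimal,
self-contained illustration and not Neutron's actual code: DeadlockError,
retry_on_deadlock, in_lock_order and create_security_group are hypothetical
names, the deadlock is simulated, and the table names other than
securitygroups are placeholders. A real service would more likely rely on
its database library's own retry helper than hand-roll one.

```python
import functools
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the deadlock error a DB driver raises."""


def retry_on_deadlock(max_retries=3, delay=0.0):
    """Re-run the wrapped transaction when it raises DeadlockError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    if attempt == max_retries:
                        raise  # out of retries: surface the error
                    # Linear backoff before the next attempt.
                    time.sleep(delay * (attempt + 1))
        return wrapper
    return decorator


def in_lock_order(tables):
    """Return table names in one fixed (alphabetical) order, so that
    every transaction acquires its locks in the same sequence."""
    return sorted(tables)


attempts = []


@retry_on_deadlock(max_retries=3)
def create_security_group():
    # Simulated transaction: the first two attempts "deadlock".
    attempts.append(1)
    if len(attempts) < 3:
        raise DeadlockError("simulated deadlock")
    return in_lock_order(["securitygroups", "ports", "networks"])


result = create_security_group()
```

The whole transaction is retried, not just the failing statement, because
the database rolls back the deadlock victim's work; the function being
wrapped therefore needs to be safe to run again from the start.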
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>