[openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client

Ihar Hrachyshka ihrachys at redhat.com
Fri Sep 12 10:41:42 UTC 2014


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Some updates/concerns/questions.

The status of introducing a new driver to gate is:

- all the patches for mysql-connector are merged in all projects;
- all devstack patches to support switching the driver are merged;
- a new sqlalchemy-migrate library release is out;

- the version bump is *not* yet done;
- the package is *not* yet published on PyPI;
- the new gate job is *not* yet introduced.

The new sqlalchemy-migrate release introduced unit test failures in
three projects: nova, cinder, and glance.

On the technical side of the failure: my understanding is that the
projects that started to fail assume too much about how their SQL
scripts are executed. They assume the scripts are executed in one go,
and they also assume they need to open and commit a transaction on
their own. I don't think this is something to be fixed in
sqlalchemy-migrate itself. Instead, simply removing those
'BEGIN TRANSACTION; ... COMMIT;' statements should just work and looks
like a sane thing to do anyway. I've proposed patches for all three
projects to handle it [1].
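
To illustrate the failure mode described above, here is a minimal
sketch. It is not sqlalchemy-migrate code: it uses the stdlib sqlite3
module as a stand-in, and the script contents, table name, and helper
names are all hypothetical. It assumes the runner executes statements
one at a time inside a transaction that the runner itself owns, so a
script that issues its own BEGIN fails, while the same script with the
transaction statements removed runs cleanly.

```python
import sqlite3

# Hypothetical migration script that opens and commits its own transaction.
SCRIPT = """
BEGIN TRANSACTION;
CREATE TABLE widgets (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO widgets (name) VALUES ('a');
COMMIT;
"""

TRANSACTION_STATEMENTS = {
    "BEGIN", "BEGIN TRANSACTION", "COMMIT", "END", "END TRANSACTION",
}


def split_statements(script):
    """Naive split on ';' (good enough here; breaks on ';' inside literals)."""
    return [s.strip() for s in script.split(";") if s.strip()]


def strip_transaction_control(statements):
    """Drop statements that only open or close a transaction."""
    return [s for s in statements if s.upper() not in TRANSACTION_STATEMENTS]


def run_in_caller_transaction(conn, statements):
    """Execute statements one by one inside a transaction owned by the caller.

    Assumes conn was opened with isolation_level=None, so sqlite3 does not
    start transactions implicitly.
    """
    conn.execute("BEGIN")
    try:
        for statement in statements:
            conn.execute(statement)
    except Exception:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")
```

Run against a connection opened with
sqlite3.connect(":memory:", isolation_level=None), the unmodified
statement list raises sqlite3.OperationalError on the nested BEGIN,
while the output of strip_transaction_control() applies without error.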

That said, those failures were solved by pinning the version of the
library in openstack/requirements and in those projects. This is in
stark contrast to how we handled the new testtools release just a few
weeks ago, when the problem was solved by fixing the three affected
projects because of their incorrect usage of tearDown/setUp methods.

What's more, those failures seem to have triggered the decision to
move the enable-mysql-connector oslo spec to Kilo, even though the
library version bump is the *only* change missing codewise (we will
also need a gate job description, but that doesn't touch the codebase
at all). The decision looks too hasty and ungrounded to me. Is it
really the gate failure in three projects that led to it, or are there
some other hidden reasons behind it? Was it discussed anywhere? If so,
I wasn't given a chance to participate in that discussion; I suspect
another supporter of the spec (Angus Lees) was not involved either.

By not allowing those last pieces of the spec in this cycle, we just
postpone the start of any realistic testing of the feature for another
half a year.

Why do we block new sqlalchemy-migrate and the spec for another cycle
instead of fixing the affected projects with *primitive* patches like
we did for new testtools?

[1]:
https://review.openstack.org/#/q/I10c58b3af75d3ab9153a8bbd2a539bf1577de328,n,z

/Ihar

On 09/07/14 13:17, Ihar Hrachyshka wrote:
> Hi all,
> 
> Multiple projects are suffering from db lock timeouts due to
> deadlocks deep in the mysqldb library that we use to interact with
> mysql servers. In essence, the problem is due to missing eventlet
> support in the mysqldb module: when a db lock is encountered, the
> library does not yield to the next green thread, which would allow
> other threads to eventually release the grabbed lock. Instead it
> just blocks the main thread, which eventually raises a timeout
> exception (OperationalError).
> 
> The failed operation is not retried, leaving the failing request
> unserved. In Nova, there is a special retry mechanism for deadlocks,
> though I think it's more a hack than a proper fix.
> 
> Neutron is one of the projects that suffer from those timeout
> errors a lot. Partly it's due to lack of discipline in how we do
> nested calls in l3_db and ml2_plugin code, but that's not something
> to change in foreseeable future, so we need to find another
> solution that is applicable for Juno. Ideally, the solution should
> be applicable for Icehouse too to allow distributors to resolve
> existing deadlocks without waiting for Juno.
> 
> We've had several discussions and attempts to introduce a solution
> to the problem. Thanks to oslo.db guys, we now have more or less
> clear view on the cause of the failures and how to easily fix them.
> The solution is to switch mysqldb to something eventlet aware. The
> best candidate is probably MySQL Connector module that is an
> official MySQL client for Python and that shows some (preliminary)
> good results in terms of performance.
> 
> I've posted a Neutron spec for the switch to the new client in Juno
> at [1]. Ideally, switch is just a matter of several fixes to
> oslo.db that would enable full support for the new driver already
> supported by SQLAlchemy, plus 'connection' string modified in
> service configuration files, plus documentation updates to refer to
> the new official way to configure services for MySQL. The database
> code won't, ideally, require any major changes, though some
> adaptation for the new client library may be needed. That said,
> Neutron does not seem to require any changes, though it was
> revealed that there are some alembic migration rules in Keystone or
> Glance that need (trivial) modifications.
> 
> You can see how trivially the switch can be achieved for a service
> based on the example for Neutron [2].
> 
> While this is a Neutron specific proposal, there is an obvious wish
> to switch to the new library globally throughout all the projects,
> to reduce devops burden, among other things. My vision is that,
> ideally, we switch all projects to the new library in Juno, though
> we still may leave several projects for K in case any issues arise,
> similar to the way projects switched to oslo.messaging during two
> cycles instead of one. Though looking at how easily Neutron can be
> switched to the new library, I wouldn't expect any issues that
> would postpone the switch till K.
> 
> It was mentioned in comments to the spec proposal that there were
> some discussions at the latest summit around possible switch in
> context of Nova that revealed some concerns, though they do not
> seem to be documented anywhere. So if you know anything about it,
> please comment.
> 
> So, we'd like to hear from other projects what your take on that
> move is, and whether you see any issues or have concerns about it.
> 
> Thanks for your comments,
> /Ihar
> 
> [1]: https://review.openstack.org/#/c/104905/
> [2]: https://review.openstack.org/#/c/105209/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)

iQEcBAEBCgAGBQJUEs3mAAoJEC5aWaUY1u570H4H/0eUzbM1ThuhegzfwK3CH40l
Zaenrpc4NvYd1vAZbfusKxogqEfUnJpkOFxm2/wd9EXEtoV85NSy/wnOKgX6Av9C
FYB78kCQG45+6o5/fz35NUJOrR6tJAryOyBEKVaZm4dvaIk/zKwTp4J1qrj/Rq+g
Ux5RGYSeGnIH1TQKpwk2+egkXX13P6BY4Kx8/xU6g3e/7scEHpirsyyRZUYxn9Tb
nUCxdZRwdE0nxDFbHCq8jgMs9nCCcRzEEMwnEaPo163o0VSSVEbYWtEtSaYaAEpK
EQbuXoQ7g5WkStxYebmwhbHl6M/Vgaa9RUt1IrdGAxFsN2MkI+MkMX9sXFUXkwQ=
=K0Nf
-----END PGP SIGNATURE-----


