<tt><font size=2>Hello everyone,</font></tt>
<br>
<br><tt><font size=2>We have been investigating why the Jenkins check job
gate-tempest-dsvm-networking-ovn (non-voting at the moment) keeps failing.
It has been failing fairly consistently on every commit.
I would like to start a conversation to get some input on why these errors
may be happening.</font></tt>
<br>
<br><tt><font size=2>One kind of error is the following timeout (from
the q-svc logs):</font></tt>
<br>
<br><tt><font size=1>2015-08-04 05:40:28.313 ERROR neutron.agent.ovsdb.impl_idl
[req-c189268a-1e1d-462f-a81e-62f0a34ff490 tempest-FloatingIPAdminTestJSON-1706130555
tempest-FloatingIPAdminTestJSON-1943105894] Traceback (most recent call
last):</font></tt>
<br><tt><font size=1> File "/opt/stack/new/neutron/neutron/agent/ovsdb/native/connection.py",
line 84, in run</font></tt>
<br><tt><font size=1> txn.results.put(txn.do_commit())</font></tt>
<br><tt><font size=1> File "/opt/stack/new/neutron/neutron/agent/ovsdb/impl_idl.py",
line 99, in do_commit</font></tt>
<br><tt><font size=1> seqno)</font></tt>
<br><tt><font size=1> File "/opt/stack/new/neutron/neutron/agent/ovsdb/native/idlutils.py",
line 125, in wait_for_change</font></tt>
<br><tt><font size=1> raise Exception("Timeout")</font></tt>
<br><tt><font size=1>Exception: Timeout</font></tt>
<br>
<br><tt><font size=2>When this error happens, a separate thread hits a
DB deadlock. The deadlock is not always in create_port (65%); it can also be in
delete_port (30%) or other calls (5%). These deadlock errors (shown below)
are far more numerous than the timeout error above.</font></tt>
<br>
<br><tt><font size=2>The failing statement, however, is always the same: UPDATE ipavailabilityranges
SET first_ip=%s WHERE ipavailabilityranges.allocation_pool_id = %s AND
ipavailabilityranges.first_ip = %s AND ipavailabilityranges.last_ip = %s</font></tt>
<br>
<br><tt><font size=1>2015-08-04 05:39:37.303 9407 ERROR oslo_db.api
File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py",
line 136, in wrapper</font></tt>
<br><tt><font size=1>2015-08-04 05:39:37.303 9407 ERROR oslo_db.api
return f(*args, **kwargs)</font></tt>
<br><tt><font size=1>2015-08-04 05:39:37.303 9407 ERROR oslo_db.api
File "/opt/stack/new/networking-ovn/networking_ovn/plugin.py",
line 275, in create_port</font></tt>
<br><tt><font size=1>2015-08-04 05:39:37.303 9407 ERROR oslo_db.api
db_port = super(OVNPlugin, self).create_port(context, port)</font></tt>
<br><tt><font size=1>...</font></tt>
<br><tt><font size=1>2015-08-04 05:39:37.303 9407 ERROR oslo_db.api
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/cursors.py",
line 205, in execute</font></tt>
<br><tt><font size=1>2015-08-04 05:39:37.303 9407 ERROR oslo_db.api
self.errorhandler(self, exc, value)</font></tt>
<br><tt><font size=1>2015-08-04 05:39:37.303 9407 ERROR oslo_db.api
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py",
line 36, in defaulterrorhandler</font></tt>
<br><tt><font size=1>2015-08-04 05:39:37.303 9407 ERROR oslo_db.api
raise errorclass, errorvalue</font></tt>
<br><tt><font size=1>2015-08-04 05:39:37.303 9407 ERROR oslo_db.api DBDeadlock:
(_mysql_exceptions.OperationalError) (1205, 'Lock wait timeout exceeded;
try restarting transaction') [SQL: u'UPDATE ipavailabilityranges SET first_ip=%s
WHERE ipavailabilityranges.allocation_pool_id = %s AND ipavailabilityranges.first_ip
= %s AND ipavailabilityranges.last_ip = %s'] [parameters: ('10.100.0.3',
'851466c3-8d6b-4629-bf65-86be2f403e67', '10.100.0.2', '10.100.0.14')]</font></tt>
<br><tt><font size=1>2015-08-04 05:39:37.303 9407 ERROR oslo_db.api </font></tt>
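<br>
<br><tt><font size=2>For anyone who wants to reproduce the per-call breakdown from their own q-svc log, here is a rough sketch of how it could be tallied. It assumes each log record sits on a single line (unlike the wrapped excerpt above) and that the first networking_ovn/plugin.py frame in a traceback names the API call. The function names tally_deadlocks and as_percentages are made up for illustration.</font></tt>

```python
import re
from collections import Counter

# Matches the traceback frame that names the plugin method, e.g.
#   File ".../networking_ovn/plugin.py", line 275, in create_port
PLUGIN_FRAME = re.compile(r'networking_ovn/plugin\.py",\s+line \d+, in (\w+)')


def tally_deadlocks(log_text):
    """Count DBDeadlock errors per plugin method (hypothetical helper).

    Assumes one log record per line and that the first plugin.py frame
    seen before a DBDeadlock line is the API call that hit it.
    """
    counts = Counter()
    method = None
    for line in log_text.splitlines():
        frame = PLUGIN_FRAME.search(line)
        if frame and method is None:
            # Remember the outermost plugin frame of this traceback.
            method = frame.group(1)
        elif "DBDeadlock" in line:
            if method is not None:
                counts[method] += 1
            # Start fresh for the next traceback.
            method = None
    return counts


def as_percentages(counts):
    """Convert raw counts into percentages of the total, rounded to 0.1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}
```

<br><tt><font size=2>Calling as_percentages(tally_deadlocks(open(path).read())) on a log file would then give the share of deadlocks per call, where path is wherever your q-svc log lives.</font></tt>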
<br>
<br><tt><font size=2>Russell suggested removing the MYSQL_DRIVER=MySQL-python
declaration from local.conf (</font></tt><a href=https://review.openstack.org/#/c/216413/><tt><font size=2 color=blue>https://review.openstack.org/#/c/216413/</font></tt></a><tt><font size=2>),
which results in PyMySQL being used as the default.</font></tt>
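<br>
<br><tt><font size=2>A quick way to confirm which driver a given setup actually ends up with is to look at the scheme of the database connection URL (the [database] connection option in neutron.conf). With SQLAlchemy, "mysql+pymysql://" selects PyMySQL, while a bare "mysql://" falls back to the MySQLdb driver (MySQL-python). Below is a small sketch of that check. The helper name sqlalchemy_driver and the example URLs are only illustrative, and I am assuming the driver choice shows up in that URL on your setup.</font></tt>

```python
from urllib.parse import urlsplit


def sqlalchemy_driver(connection_url):
    """Split a SQLAlchemy-style URL scheme into (backend, driver).

    driver is None when the URL names no driver, in which case SQLAlchemy
    uses the dialect's default (MySQLdb, i.e. MySQL-python, for "mysql").
    Hypothetical helper for illustration only.
    """
    scheme = urlsplit(connection_url).scheme
    backend, _, driver = scheme.partition("+")
    return backend, (driver or None)


# Example URLs are made up; substitute the value from your own neutron.conf.
print(sqlalchemy_driver("mysql+pymysql://root:secret@127.0.0.1/neutron?charset=utf8"))
print(sqlalchemy_driver("mysql://root:secret@127.0.0.1/neutron"))
```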
<br>
<br><tt><font size=2>With that change, the DB errors above are no longer
seen in my local setup. In CI, however, the </font></tt><a href="http://logs.openstack.org/13/216413/1/check/gate-networking-ovn-python27/1f9be86/"><font size=2 color=#0060a0>gate-networking-ovn-python27</font></a><tt><font size=2>
test is now failing, and therefore </font></tt><a href="http://logs.openstack.org/13/216413/1/check/gate-tempest-dsvm-networking-ovn/1bc5757/"><font size=2 color=#0060a0>gate-tempest-dsvm-networking-ovn</font></a><tt><font size=2>
never runs.</font></tt>
<br>
<br><tt><font size=2>So there are 2 questions:</font></tt>
<br>
<ol>
<li value=1><tt><font size=2>Is there any impact from using PyMySQL for the
Jenkins check and gate jobs?</font></tt>
<li value=2><tt><font size=2>Why has </font></tt><font size=2 color=#0060a0>gate-networking-ovn-python27</font><tt><font size=2>
been failing (for the past couple of commits) in {0} networking_ovn.tests.unit.test_ovn_plugin.TestOvnPlugin.test_create_port_security
[0.194020s] ... FAILED? Do we need a separate conversation to track this?</font></tt>
</ol><tt><font size=2>Amitabha</font></tt>