And it happened again... Does anyone have a clue how to troubleshoot this? Apparently the MariaDB service got locked: nova-conductor shows the following in its logs:

2025-07-17 18:28:07.567 30 ERROR nova.servicegroup.drivers.db pymysql.err.OperationalError: (1205, 'Lock wait timeout exceeded; try restarting transaction')

This time only the nova service was affected; logging in on the dashboard through Keystone worked fine. Regards.

On Mon, Jul 14, 2025 at 14:42, Winicius Allan <winiciusab12@gmail.com> wrote:
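For context, this is how I have been inspecting the lock waits on the MariaDB side (assuming InnoDB tables; these are the MariaDB information_schema views, which MySQL 8 moved to performance_schema):

```sql
-- Transactions currently stuck waiting on a row lock
SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id, trx_query
FROM information_schema.INNODB_TRX
WHERE trx_state = 'LOCK WAIT';

-- Which transaction is blocking which
SELECT r.trx_mysql_thread_id AS waiting_thread,  r.trx_query AS waiting_query,
       b.trx_mysql_thread_id AS blocking_thread, b.trx_query AS blocking_query
FROM information_schema.INNODB_LOCK_WAITS w
JOIN information_schema.INNODB_TRX r ON r.trx_id = w.requesting_trx_id
JOIN information_schema.INNODB_TRX b ON b.trx_id = w.blocking_trx_id;

-- Full engine state, including the LATEST DETECTED DEADLOCK section
SHOW ENGINE INNODB STATUS\G
```

So far the blocking_query column has been NULL for me when the blocking transaction is idle, so the thread id is what I use to track down the client.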
Hi community o/
Suddenly, I noticed that login on the dashboard wasn't working, so I checked the Keystone logs and saw entries like this:
2025-07-10 00:53:45.120 23 ERROR sqlalchemy.pool.impl.QueuePool oslo_db.exception.DBDeadlock: (pymysql.err.OperationalError) (1205, 'Lock wait timeout exceeded; try restarting transaction')
Nova-conductor was also showing messages like these.
At the same time, blazar-manager was logging the following on every _process_event call:
ERROR sqlalchemy.exc.TimeoutError: QueuePool limit of size 1 overflow 50 reached, connection timed out, timeout 30.00
I do not know if it's related to the _process_event method, which batches events concurrently[1], holding the DB locked. Anyway, I increased max_pool_size from 1 to 2 on blazar-manager and that message stopped showing. Does anyone know what generally causes this kind of deadlock?
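In case it helps anyone reading along, the QueuePool error above is easy to reproduce outside OpenStack: it fires whenever all pooled connections plus the overflow allowance are checked out and a new checkout waits past pool_timeout. A minimal sketch with plain SQLAlchemy (SQLite stands in for MariaDB here; the pool_size/max_overflow/pool_timeout arguments are what oslo.db's [database] max_pool_size, max_overflow, and pool_timeout options map onto):

```python
# Reproduce "QueuePool limit of size N overflow M reached, connection
# timed out" by exhausting a 1-slot pool with no overflow.
import sqlalchemy
from sqlalchemy.pool import QueuePool

engine = sqlalchemy.create_engine(
    "sqlite://",
    poolclass=QueuePool,   # force QueuePool, as used for real DB drivers
    pool_size=1,           # cf. oslo.db [database] max_pool_size
    max_overflow=0,        # cf. oslo.db [database] max_overflow
    pool_timeout=0.5,      # cf. oslo.db [database] pool_timeout (seconds)
)

first = engine.connect()   # occupies the only pooled slot

err = None
try:
    second = engine.connect()  # waits pool_timeout, then gives up
except sqlalchemy.exc.TimeoutError as exc:
    err = exc

print(err)
```

This is why bumping max_pool_size made the message go away; whether the slot was held legitimately or leaked by a long-running lock wait is the part I'm still unsure about.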
I'd like to ask another question: I have three controllers in my environment. Why do controllers that do not hold the active VIP address still have established connections on the MySQL port? For example:
tcp ESTAB 0 0 10.0.0.105:3306 10.0.0.107:50642 users:(("mariadbd",pid=977010,fd=1089))
[1] https://github.com/openstack/blazar/blob/stable/2024.1/blazar/manager/servic...
Best regards.