Hello,

I'm not sure whether this is the root cause of your problem, but it could be related to:
- https://mariadb.com/resources/blog/isolation-level-violation-testing-and-debugging-in-mariadb/
- https://jira.mariadb.org/browse/MDEV-35124

At the end of last week we noticed strange behaviour caused by this new transactional model; see https://bugs.launchpad.net/nova/+bug/2116186/.
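
On the blazar-manager QueuePool error in your message below: SQLAlchemy's QueuePool allows at most pool_size + max_overflow connections to be checked out at once, and a further checkout waits for the timeout before failing. With size 1 and overflow 50 that is 51 connections, so reaching the limit may mean that connections were being held for a long time (for example by transactions waiting on a lock) rather than the pool being too small. That is a guess on my side, not something I verified. Here is a toy model in plain Python to show the arithmetic. It is not SQLAlchemy code, and TinyPool, PoolTimeout and count_checkouts are names I made up for illustration:

```python
import threading


class PoolTimeout(Exception):
    """Raised when no connection slot frees up within the timeout."""


class TinyPool:
    """Toy model of a bounded pool. This is NOT SQLAlchemy's implementation."""

    def __init__(self, pool_size, max_overflow, timeout):
        # Upper bound on connections checked out at the same time.
        self.capacity = pool_size + max_overflow
        self.timeout = timeout
        self._slots = threading.BoundedSemaphore(self.capacity)

    def checkout(self):
        # Wait at most `timeout` seconds for a free slot, then give up.
        if not self._slots.acquire(timeout=self.timeout):
            raise PoolTimeout(
                f"limit of {self.capacity} connections reached, "
                f"timed out after {self.timeout:.2f}s"
            )

    def checkin(self):
        # Return a slot to the pool.
        self._slots.release()


def count_checkouts(pool, attempts):
    """Check out connections without returning any; count the successes."""
    ok = 0
    for _ in range(attempts):
        try:
            pool.checkout()
            ok += 1
        except PoolTimeout:
            break
    return ok


if __name__ == "__main__":
    # pool_size and max_overflow come from the log line; the timeout is
    # shortened from 30 s so the demo finishes quickly.
    pool = TinyPool(pool_size=1, max_overflow=50, timeout=0.1)
    print(count_checkouts(pool, attempts=60))  # prints 51
```

In this model, raising pool_size from 1 to 2 only moves the ceiling from 51 to 52, so if the error disappeared after that change, something else probably changed at the same time.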

On Mon, Jul 14, 2025 at 19:43, Winicius Allan <winiciusab12@gmail.com> wrote:
Hi community o/

Suddenly, I noticed that login on the dashboard wasn't working, so I checked the keystone logs. I saw some entries like this:

2025-07-10 00:53:45.120 23 ERROR sqlalchemy.pool.impl.QueuePool oslo_db.exception.DBDeadlock: (pymysql.err.OperationalError) (1205, 'Lock wait timeout exceeded; try restarting transaction')

Nova-conductor was also showing messages like these.

At the same time, blazar-manager was logging the following after every _process_event call:

ERROR sqlalchemy.exc.TimeoutError: QueuePool limit of size 1 overflow 50 reached, connection timed out, timeout 30.00

I do not know whether this is related to the _process_event method, which batches events concurrently[1], running while the DB was locked. I increased max_pool_size from 1 to 2 on blazar-manager, and that message stopped appearing. Does anyone know what generally causes this kind of deadlock?

I'd like to ask another question: I have three controllers in my environment. Why do controllers that do not hold the active VIP address have established connections on the MySQL port? For example:

tcp   ESTAB      0      0               10.0.0.105:3306           10.0.0.107:50642 users:(("mariadbd",pid=977010,fd=1089))

[1] https://github.com/openstack/blazar/blob/stable/2024.1/blazar/manager/service.py#L243

Best regards.


--
Hervé Beraud
Principal Software Engineer at Red Hat
irc: hberaud