[openstack-dev] [Infra][Fuel] Increasing deadlock_timeout for PostgreSQL

Roman Prykhodchenko me at romcheg.me
Mon Mar 21 16:39:52 UTC 2016


Folks,

We have been analyzing a bunch of random failures in Fuel tests and found several that were caused by the deadlock detector occasionally raising errors [1]. After our attempts to reproduce this behavior failed, we decided to run the same test suite on overloaded nodes. Those test runs allowed us to catch the same behavior we had seen on CI slaves. After analyzing both the PostgreSQL logs and Nailgun’s code, we found no reason for those deadlocks to occur.

Given these facts, we came up with the idea that those random deadlocks occur when CI slaves are overloaded by other jobs and transactions start hitting the deadlock timeout. I therefore propose changing PostgreSQL’s deadlock_timeout from its default value of 1 second to 3-5 seconds. That will slow down tests if they run on an overloaded CI slave, but it will help to avoid random, false-positive deadlock warnings.
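
To make the proposal concrete, here is a minimal sketch of how the setting could be changed in the text of a postgresql.conf file. The helper name set_deadlock_timeout is hypothetical and is not part of Fuel, Nailgun, or the CI tooling; only the deadlock_timeout parameter itself comes from PostgreSQL. The sketch assumes that the CI slaves are configured through postgresql.conf and that any existing entry sits on a line of its own.

```python
import re

# Hypothetical helper for illustration; not part of Fuel or the CI tooling.
# Matches an existing deadlock_timeout entry, commented out or not.
_SETTING = re.compile(r"^[ \t]*#?[ \t]*deadlock_timeout[ \t]*=.*$", re.MULTILINE)


def set_deadlock_timeout(conf_text: str, value: str = "3s") -> str:
    """Return postgresql.conf text with deadlock_timeout set to `value`."""
    line = f"deadlock_timeout = {value}"
    if _SETTING.search(conf_text):
        # Replace the first existing (possibly commented-out) entry.
        return _SETTING.sub(lambda _match: line, conf_text, count=1)
    # No entry yet: append one and end the file with a newline.
    sep = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return f"{conf_text}{sep}{line}\n"


if __name__ == "__main__":
    sample = "max_connections = 100\n#deadlock_timeout = 1s\n"
    print(set_deadlock_timeout(sample, "3s"), end="")
```

After the file is rewritten, PostgreSQL has to re-read its configuration for the new value to take effect. Note that deadlock_timeout only controls how long a backend waits on a lock before the deadlock check runs, so a larger value delays detection rather than changing how locks are taken.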


References:

1. https://bugs.launchpad.net/fuel/+bug/1556070


- romcheg

