[openstack-dev] [Infra][Fuel] Increasing deadlock_timeout for PostgreSQL

Igor Kalnitsky ikalnitsky at mirantis.com
Mon Mar 21 17:30:09 UTC 2016


Hey Roman,

Thank you for the investigation. However, I think that changing
'deadlock_timeout' won't help us. According to the PostgreSQL
documentation [1], this option sets how long to wait on a lock before
checking whether there is a deadlock condition. So it won't fix the
deadlocks themselves.

Thus I see no reason why we should change that option and wait longer
before the deadlock exception is raised.
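
As a rough illustration of that point, here is a toy Python model. It is
not PostgreSQL code: the function names and the simplified wait-for graph
(each transaction waits on at most one other) are assumptions made only
for this sketch. It shows that a larger timeout delays the deadlock error
for a real lock cycle but does not make the cycle go away, while a plain
lock wait without a cycle never produces a deadlock error at any setting.

```python
# Toy model only: a sketch of the semantics, not how PostgreSQL is implemented.

def has_cycle(wait_for):
    """Return True if the wait-for graph (txn -> txn it waits on) has a cycle."""
    for start in wait_for:
        seen = set()
        node = start
        # Follow the chain of "waits on" edges until it ends or repeats.
        while node in wait_for:
            if node in seen:
                return True
            seen.add(node)
            node = wait_for[node]
    return False


def seconds_until_deadlock_error(wait_for, deadlock_timeout):
    """Seconds a blocked transaction waits before the deadlock check
    reports an error, or None if the check finds no cycle."""
    if has_cycle(wait_for):
        return deadlock_timeout
    return None


real_deadlock = {"tx1": "tx2", "tx2": "tx1"}  # tx1 and tx2 wait on each other
plain_wait = {"tx1": "tx2"}                   # tx1 just waits for tx2 to finish

print(seconds_until_deadlock_error(real_deadlock, 1.0))  # 1.0
print(seconds_until_deadlock_error(real_deadlock, 5.0))  # 5.0
print(seconds_until_deadlock_error(plain_wait, 1.0))     # None
print(seconds_until_deadlock_error(plain_wait, 5.0))     # None
```

In this model, going from 1 second (the PostgreSQL default) to 5 seconds
only changes when the error for a real cycle is reported.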

- Igor

[1]: http://www.postgresql.org/docs/9.4/static/runtime-config-locks.html

On Mon, Mar 21, 2016 at 6:39 PM, Roman Prykhodchenko <me at romcheg.me> wrote:
> Folks,
>
> We have been analyzing a bunch of random failures in Fuel tests and found several caused by the deadlock detector occasionally raising errors [1]. After our attempts to reproduce this behavior failed, we decided to run the same test suite on overloaded nodes. Those test runs allowed us to catch the same behavior we’ve seen on CI slaves. After analyzing both the PostgreSQL logs and Nailgun’s code, we found no reason for those deadlocks to occur.
>
> Considering these facts, we came up with the idea that those random deadlocks occur when CI slaves are overloaded by other jobs and transactions start hitting the deadlock timeout. Thus I propose changing PostgreSQL’s deadlock_timeout value from the default to 3-5 seconds. That will slow down tests if they run on an overloaded CI slave, but it will help to avoid random, false-positive deadlock warnings.
>
>
> References:
>
> 1. https://bugs.launchpad.net/fuel/+bug/1556070
>
>
> - romcheg


