[oslo][relmgmt][taskflow][nova] fasteners.ReaderWriterLock and oslo.concurrency's fair lock are broken when used with eventlet
Hi,

It is hard to figure out exactly which OpenStack projects are affected, but I want to give a heads-up. You are affected if both of the following are true for your project:

* you use eventlet, and
* you use oslo.concurrency's fair internal lock (external locks and non-fair locks are not affected), or you use fasteners.ReaderWriterLock directly.

I know that nova is affected, and based on a code search at least taskflow is affected as well.

oslo.concurrency's fair lock uses fasteners.ReaderWriterLock[1]. fasteners.ReaderWriterLock relies on threading.current_thread to identify a thread and to decide whether that thread already holds the lock and can therefore re-enter it. However, in an eventlet monkey-patched environment, for a thread created with eventlet.spawn_n() or the patched threading.Thread(), the threading.current_thread call does not return an eventlet-unique ID. As a result, the lock can be re-entered from multiple eventlets at the same time. (A minimal reproducer sketch and a sketch of the workaround are included below the references.)

We have 4 ways forward:

0) Fix eventlet. There is an open issue in eventlet[3] for this, open since October 2021. Based on the ticket, this direction does not seem feasible.

1) Sean recently opened an issue[4] on fasteners to restore a previously existing workaround that could fix our issue. If a new fasteners lib is released with the workaround[5], then at least the oslo.concurrency requirement needs to be bumped and a new oslo release pushed.

2) If the fasteners maintainer does not accept [4][5] in time, then I have an oslo.concurrency patch[6] that implements a workaround in oslo. This also requires a new oslo.concurrency release. It also means that projects using fasteners.ReaderWriterLock directly need to re-implement the fix locally.

3) If all else fails, I have a nova-only patch[7] that implements the workaround locally in nova.

Note that this issue is present on stable/yoga and on master. On stable/xena we use fasteners < 0.15.0, which is not affected.

Cheers,
gibi

[1] https://github.com/openstack/oslo.concurrency/blob/052b2f23572900601b0f41387...
[2] https://bugs.launchpad.net/oslo.concurrency/+bug/1988311
[3] https://github.com/eventlet/eventlet/issues/731
[4] https://github.com/harlowja/fasteners/issues/96
[5] https://github.com/harlowja/fasteners/pull/97
[6] https://review.opendev.org/q/topic:bug/1988311+project:openstack/oslo.concur...
[7] https://review.opendev.org/q/topic:bug/1988311+project:openstack/nova
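For illustration, here is a minimal reproducer sketch. It is not taken from any of the linked patches; it assumes eventlet and fasteners >= 0.15.0 are installed, and the exact output depends on the eventlet version. The point is that greenthreads spawned with eventlet.spawn_n() can report the same threading.current_thread() identity, so the write lock treats the second greenthread as a re-entering holder and lets it in while the first one is still inside the critical section:

    # reproducer sketch: under affected eventlet/fasteners versions both
    # greenthreads can end up inside the write lock at the same time
    import eventlet
    eventlet.monkey_patch()

    import threading

    import fasteners

    lock = fasteners.ReaderWriterLock()


    def writer(name):
        # which "thread" the lock implementation will see for this greenthread
        print(name, "identified as", threading.current_thread())
        with lock.write_lock():
            print(name, "entered the write lock")
            eventlet.sleep(0.1)  # yield so the other greenthread can run
            print(name, "leaving the write lock")


    eventlet.spawn_n(writer, "greenthread-1")
    eventlet.spawn_n(writer, "greenthread-2")
    eventlet.sleep(0.5)

And here is a rough sketch of the shape of the workaround that [6] and [7] implement. This is not the exact patch: the class name is made up, it pokes the private _current_thread attribute of fasteners.ReaderWriterLock, and it assumes eventlet.getcurrent gives a per-greenthread identity, so please look at the linked reviews for the real code:

    # workaround sketch: identify lock holders by the current greenlet
    # instead of threading.current_thread when eventlet is monkey patching
    import fasteners

    try:
        import eventlet
        from eventlet import patcher
    except ImportError:
        eventlet = None
        patcher = None


    class EventletAwareReaderWriterLock(fasteners.ReaderWriterLock):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            if eventlet is not None and patcher.is_monkey_patched('thread'):
                # _current_thread is the callable fasteners uses to identify
                # the lock holder; eventlet.getcurrent is unique per
                # greenthread, including ones created with spawn_n()
                self._current_thread = eventlet.getcurrent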
The oslo patches have been approved; I'll propose a new release in a couple of hours, once they have merged.
--
Hervé Beraud
Senior Software Engineer at Red Hat
irc: hberaud
https://github.com/4383/
https://twitter.com/4383hberaud
The patches are now merged and a new release is on the way: https://review.opendev.org/c/openstack/releases/+/856037 Once the release patch is merged, I'll request an RFE for a UC (upper-constraints) bump for oslo.concurrency, which is an independent deliverable.
--
Hervé Beraud
Senior Software Engineer at Red Hat
irc: hberaud
https://github.com/4383/
https://twitter.com/4383hberaud
participants (2)
- Balazs Gibizer
- Herve Beraud