Hi,
As context, I've been working on submitting PRs to eventlet on behalf of G-Research OSS, with the initial goal of getting 3.12 support. Here are some thoughts on the problems I've seen with eventlet, and a potential solution, based on what I've learned:
## Short-term problem: lack of maintenance
Symptoms:
1. Tests don't pass, locally or in CI.
2. CI doesn't run at all for Python 3.11.
3. PR backlog.
If it's possible to add more maintainers, and put some engineering time into it, this is all solvable. In addition to any resources OpenStack companies can bring to bear, there are also usually some potential maintainers among people who have submitted PRs. I've personally successfully revived one project (SparkMagic), initially just going through the backlog, and then I managed to recruit someone who has been maintaining it for a few years.
It's possible that if a bunch of you opened an issue saying "hey we rely on eventlet, we understand you don't have time to work on this, can you give commit access to person X who we trust" you might get somewhere.
## Long-term problem: Monkeypatching is a technological dead-end
The premise of eventlet is drop-in compatibility via monkeypatching. Unfortunately that quite possibly hasn't been true for a long time, and it's becoming increasingly difficult to maintain over time.
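To make the fragility concrete, here is a minimal standard-library-only sketch of monkeypatching. It does not use eventlet; `green_sleep` is a hypothetical stand-in for a cooperative replacement. It shows that a patch only affects callers that look the function up at call time, so the result depends on how other code happened to import it.

```python
import time

calls = []

def green_sleep(seconds):
    # Hypothetical stand-in for a cooperative sleep; it only records the call.
    calls.append(seconds)

# Code that did `from time import sleep` before patching keeps the original.
from time import sleep as early_bound_sleep

original_sleep = time.sleep
time.sleep = green_sleep  # the monkeypatch
try:
    time.sleep(0.01)         # attribute looked up at call time: patched
    early_bound_sleep(0.01)  # bound before the patch: still the real sleep
finally:
    time.sleep = original_sleep  # always restore the real function

print(calls)
```

Only one of the two calls is recorded, because the early-bound reference bypasses the patch.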
### Example #1: Compliance suite
Per the docs, "Eventlet provides the ability to test itself with the standard Python networking tests. This verifies that the libraries it wraps work at least as well as the standard ones do." (https://github.com/eventlet/eventlet/blob/master/doc/testing.rst#standard-li...)
That is, eventlet will run the Python standard library's test suite against eventlet to make sure it's compatible.
Unfortunately, this testing mechanism was never updated for Python 3.
As such, it's basically designed for Python 2.7, and there have been 13 major releases of Python since then. Is eventlet still compatible with the standard library? It's hard to say, but quite likely not.
### Example #2: RLock
When eventlet was originally written, `threading.RLock` was written in Python. That implementation has bugs, e.g. it doesn't actually work in the face of signals: https://bugs.python.org/issue13697 (there are a bunch of comments in the ticket from people encountering this in the real world, logging being a common situation.)
The problem doesn't occur in the version of RLock which is written in C, which is the current default and was introduced in Python 3.2.
However, the C version of RLock doesn't work with eventlet, so eventlet has been monkeypatching `threading.RLock`, replacing it with the (buggy and unfixable) version written in Python (`threading._PyRLock`).
In 3.11 this gets worse, as the RLock version written in Python has become subtly incompatible with eventlet's expectations. To get the eventlet test suite passing on 3.11 I had to copy/paste the Python RLock code and tweak it (https://github.com/eventlet/eventlet/pull/823/files#diff-029df1ae9b7431e9cdd... 8816R427).
So now eventlet has to use a forked version of a broken implementation of RLock. It's possible there's another solution, but eventlet basically relies on monkeypatching a whole bunch of functions, and on implementation details of how the Python standard library uses those functions, which are not always stable over time.
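As a rough illustration of the two RLock implementations discussed above, the following sketch inspects what `threading.RLock()` returns and, if available, instantiates the pure-Python one. `threading._PyRLock` is a private CPython name, so the code treats its presence as an assumption and looks it up defensively.

```python
import threading

# threading.RLock is a factory; on CPython it normally returns the C version.
default_lock = threading.RLock()
print("default RLock type:", type(default_lock).__name__)

# _PyRLock is a private implementation detail and may change or disappear.
py_rlock_cls = getattr(threading, "_PyRLock", None)

if py_rlock_cls is not None:
    py_lock = py_rlock_cls()
    print("pure-Python RLock type:", type(py_lock).__name__)
    # Both implementations are reentrant: one thread may acquire twice.
    for lock in (default_lock, py_lock):
        lock.acquire()
        lock.acquire()
        lock.release()
        lock.release()
```

Depending on a private name like this is exactly the kind of coupling to standard library internals that can break between Python versions.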
This problem will continue to get worse as Python evolves. E.g. it would not surprise me if the GIL removal makes things even more difficult for eventlet.
Is all this solvable? Yes, but at a potentially significant engineering cost.
## A potential path for migrating off of eventlet
Like many networking frameworks, eventlet has pluggable event loops, in this case called a "hub". Typically these wrap things like select() and epoll(), but there also used to be a hub that ran on Twisted.
My hypothesis is that it should be possible to:
1. Create a hub that runs on asyncio. At this point asyncio and eventlet code can run in the same event loop.
2. Create a little bit of glue so that eventlet code can wait for the result of a coroutine object (i.e. the result of calling an `async def` function).
3. Make `eventlet.greenlet.GreenThread` something you can `await` in an `async def` function.
The end result is that asyncio code can call eventlet code, and eventlet code can call asyncio code. There is an existing integration on this model for gevent, as reference: https://pypi.org/project/asyncio-gevent/
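The third step of the hypothesis can be sketched with the standard library alone. This is not eventlet code: `FakeGreenThread` and its `link` method are hypothetical stand-ins, and the real `GreenThread` API may differ. The sketch assumes the task is already scheduled on the same asyncio event loop (as it would be with an asyncio-based hub) and shows only the bridging idea: expose completion as an `asyncio.Future` so that `await` works.

```python
import asyncio

class FakeGreenThread:
    """Hypothetical stand-in for a green thread; not eventlet's real class."""

    def __init__(self, loop, func, *args):
        self._callbacks = []
        self._result = None
        self._done = False
        # Schedule the work on the shared event loop.
        loop.call_soon(self._run, func, args)

    def _run(self, func, args):
        # Error handling is omitted to keep the sketch short.
        self._result = func(*args)
        self._done = True
        for callback in self._callbacks:
            callback(self._result)

    def link(self, callback):
        # Call `callback(result)` once the work has finished.
        if self._done:
            callback(self._result)
        else:
            self._callbacks.append(callback)

    def __await__(self):
        # Bridge: completion resolves an asyncio Future, which is awaitable.
        future = asyncio.get_running_loop().create_future()
        self.link(future.set_result)
        return future.__await__()

async def main():
    loop = asyncio.get_running_loop()
    green = FakeGreenThread(loop, lambda a, b: a + b, 2, 3)
    return await green

result = asyncio.run(main())
print(result)
```

A real integration would also need to propagate exceptions and cancellation in both directions, which this sketch leaves out.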
If this works, it would allow migrating from eventlet to asyncio in a gradual manner both within and across projects:
1. Within an OpenStack project, code could be gradually switched to async functions and asyncio libraries.
2. When an OpenStack library fully migrates to asyncio, it will still be usable by anything that is still running on eventlet.

In reply to itamar@pythonspeed.com's message of Thu, 2023-11-30 at 17:11 +0000:

So long term I think that is the approach we should take, and I agree that in the medium to short term the best way to do that is likely to create oslo.aiohub or similar, with an asyncio-backed event loop that we can enable instead of the default eventlet one.

Realistically, however, we really do need to drive this as a cross-community effort, and we need to get people from multiple teams to sign up to do that work. I'm not really sure I can get involved in this in this cycle, but if there were patches proposed to nova to optionally enable this alternative hub I would review them, and I think this is something we could set as a community goal to try and do in D/E.

I know that is late for distros, but 3.12 is not in the set of Python runtimes for Caracal: https://governance.openstack.org/tc/reference/runtimes/2024.1.html

I would like to see it in the required runtimes for D (2024.2), and if we want that to happen we need to resolve the eventlet issues. Starting in 2024.1 to enable as much 3.12 support as we can is not a bad thing; I'm just not sure it's something upstream can really commit to doing before the D/E cycle.