[eventlet-removal] When to drop eventlet support
Hi Stackers!

I would like to sync about the planned timeline of dropping eventlet support from OpenStack / Oslo.

Nova definitely needs at least the full 2026.1 cycle to have a chance to transform the nova-compute service. But this plan already feels stretched based on the progress in the current cycle. So being conservative means we need the 2026.2 cycle as a buffer.

Nova would like to keep a release where we support both eventlet and threading in parallel, so that operators can do the switch from eventlet to threading outside of the upgrade procedure. (This was an explicit request from them during the PTG.) So 2026.2 could be the version where nova fully supports both concurrency modes, while eventlet can be marked deprecated. Then the 2027.1 release could be the first release dropping eventlet.

However we need to align with the SLURP upgrade as well. 2026.1 is a SLURP. But in that release Nova might not be ready to have all services running in threading mode. So the 2026.1 - 2027.1 SLURP upgrade would force the operators to change the concurrency mode during the upgrade itself.

I see two ways forward:

* A) We say that operators who want to do the concurrency mode change outside of an upgrade cannot skip the 2026.2 release, i.e. they cannot do a SLURP upgrade directly from 2026.1 to 2027.1.

* B) We keep supporting eventlet mode in the 2027.1 release as well and only drop support in 2028.1.

What do you think?

Cheers,
gibi
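For illustration, a minimal sketch of what that operator-facing switch could look like, assuming the backend-selection API that oslo.service grew for this migration (init_backend / BackendType); the concurrency_mode option name is hypothetical, not an actual Nova option:

    # Hypothetical sketch of a service entry point selecting its concurrency
    # backend at startup. init_backend()/BackendType come from oslo.service's
    # backend-selection work; the option name is illustrative only.
    from oslo_config import cfg
    from oslo_service.backend import BackendType, init_backend

    opts = [
        cfg.StrOpt('concurrency_mode',
                   choices=['eventlet', 'threading'],
                   default='eventlet',
                   help='Concurrency backend for this service '
                        '(hypothetical option).'),
    ]
    CONF = cfg.CONF
    CONF.register_opts(opts)

    def main():
        CONF(project='nova')
        # The backend must be initialized before any eventlet-dependent
        # code paths are imported or used.
        if CONF.concurrency_mode == 'threading':
            init_backend(BackendType.THREADING)
        else:
            init_backend(BackendType.EVENTLET)

Because the setting lives in the service's own config, flipping it is decoupled from the package upgrade itself, which is the operator request described above.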
On 6/13/25 5:08 AM, Balazs Gibizer wrote:
Hi Stackers!
I would like to sync about the planned timeline of dropping eventlet support from OpenStack / Oslo.
Nova definitely needs at least the full 2026.1 cycle to have a chance to transform the nova-compute service. But this plan already feels stretched based on the progress in the current cycle. So being conservative means we need the 2026.2 cycle as a buffer.
Nova would like to keep a release where we support both eventlet and threading in parallel, so that operators can do the switch from eventlet to threading outside of the upgrade procedure. (This was an explicit request from them during the PTG.) So 2026.2 could be the version where nova fully supports both concurrency modes, while eventlet can be marked deprecated. Then the 2027.1 release could be the first release dropping eventlet.
However we need to align with the SLURP upgrade as well. 2026.1 is a SLURP. But in that release Nova might not be ready to have all services running in threading mode. So the 2026.1 - 2027.1 SLURP upgrade would force the operators to change the concurrency mode during the upgrade itself.
I see two ways forward:
* A) We say that operators who want to do the concurrency mode change outside of an upgrade cannot skip the 2026.2 release, i.e. they cannot do a SLURP upgrade directly from 2026.1 to 2027.1.
* B) We keep supporting eventlet mode in the 2027.1 release as well and only drop support in 2028.1.
Keeping eventlet running for that long is not something that is a worthy investment of time. The oslo libraries are showing a deprecation of 2026.2, and I've been using that date as the target for all Ironic work as well. Beyond the oslo team (who I don't speak for), there are folks -- like Itamar on behalf of GR-OSS -- who are doing work behind the scenes to keep eventlet running - barely. I do not expect the GR-OSS investment in this work to extend much past the midpoint of 2026.

My $.02,

Jay Faulkner
Open Source Developer
G-Research Open Source Software
---- On Fri, 13 Jun 2025 08:33:25 -0700 Jay Faulkner <jay@gr-oss.io> wrote ---
On 6/13/25 5:08 AM, Balazs Gibizer wrote:
Hi Stackers!
I would like to sync about the planned timeline of dropping eventlet support from OpenStack / Oslo.
Nova definitely needs at least the full 2026.1 cycle to have a chance to transform the nova-compute service. But this plan already feels stretched based on the progress in the current cycle. So being conservative means we need the 2026.2 cycle as a buffer.
Nova would like to keep a release where we support both eventlet and threading in parallel, so that operators can do the switch from eventlet to threading outside of the upgrade procedure. (This was an explicit request from them during the PTG.) So 2026.2 could be the version where nova fully supports both concurrency modes, while eventlet can be marked deprecated. Then the 2027.1 release could be the first release dropping eventlet.
However we need to align with the SLURP upgrade as well. 2026.1 is a SLURP. But in that release Nova might not be ready to have all services running in threading mode. So the 2026.1 - 2027.1 SLURP upgrade would force the operators to change the concurrency mode during the upgrade itself.
I see two ways forward:
* A) We say that operators who want to do the concurrency mode change outside of an upgrade cannot skip the 2026.2 release, i.e. they cannot do a SLURP upgrade directly from 2026.1 to 2027.1.
This has a big impact on upgrades and breaks our SLURP model.
* B) We keep supporting eventlet mode in the 2027.1 release as well and only drop support in 2028.1.
I am in favour of this option.

I was reading the goal doc about the timeline and found something in the 'Completion Criteria' section[1] which says:

- (2027.1) Get usage of Eventlet in oslo deliverables removed;
- "(2027.2) Get Eventlet retired from OpenStack;"

Again, 2027.2 (non-SLURP) is mentioned for eventlet retirement. I do not know if there is any technical reason to do it in a non-SLURP, or whether it can be moved to a SLURP release. Maybe hberaud knows.

Anyway, thanks gibi for bringing this. There are many projects that have not started the work yet (Nova might have more work compared to others), but I think we should discuss/re-discuss the timelines considering all projects/challenges. Accordingly, update the goal doc with the exact timelines and what the impact will be for projects that do not finish the work as per the timelines (for example upgrade issues, workarounds, etc.).

[1] https://governance.openstack.org/tc/goals/selected/remove-eventlet.html#comp...

-gmaan
Keeping eventlet running for that long is not something that is a worthy investment of time. The oslo libraries are showing a deprecation of 2026.2, I've been using that date as the target for all Ironic work as well.
Beyond the oslo team (who I don't speak for), there are folks -- like Itamar on behalf of GR-OSS -- who are doing work behind the scenes to keep eventlet running - barely. I do not expect the GR-OSS investment in this work to extend much past the midpoint of 2026.
My $.02,
Jay Faulkner
Open Source Developer
G-Research Open Source Software
On 13/06/2025 16:55, Ghanshyam Maan wrote:
---- On Fri, 13 Jun 2025 08:33:25 -0700 Jay Faulkner <jay@gr-oss.io> wrote ---
On 6/13/25 5:08 AM, Balazs Gibizer wrote:
Hi Stackers!
I would like to sync about the planned timeline of dropping eventlet support from OpenStack / Oslo.
Nova definitely needs at least the full 2026.1 cycle to have a chance to transform the nova-compute service. But this plan already feels stretched based on the progress in the current cycle. So being conservative means we need the 2026.2 cycle as a buffer.
Nova would like to keep a release where we support both eventlet and threading in parallel, so that operators can do the switch from eventlet to threading outside of the upgrade procedure. (This was an explicit request from them during the PTG.) So 2026.2 could be the version where nova fully supports both concurrency modes, while eventlet can be marked deprecated. Then the 2027.1 release could be the first release dropping eventlet.
However we need to align with the SLURP upgrade as well. 2026.1 is a SLURP. But in that release Nova might not be ready to have all services running in threading mode. So the 2026.1 - 2027.1 SLURP upgrade would force the operators to change the concurrency mode during the upgrade itself.
Yes, I think that is how it should work, if and only if a given service is capable of working without eventlet in the prior SLURP. If we don't get to the point of being able to run a given service in threaded mode in 2026.1, then the deprecation of eventlet for that service will have to be released in the next SLURP (2027.1) before the removal of support can be done in the next non-SLURP, 2027.2.
I see two ways forward:
* A) We say that operators who want to do the concurrency mode change outside of an upgrade cannot skip the 2026.2 release, i.e. they cannot do a SLURP upgrade directly from 2026.1 to 2027.1.
This has a big impact on upgrades and breaks our SLURP model.
* B) We keep supporting eventlet mode in the 2027.1 release as well and only drop support in 2028.1.
No, we can support it in 2027.1 with all services defaulting to not using it (since it's configurable), deprecate it there, and then remove it in 2027.2. We do not need to extend to 2028.1.
I am in favour of this option.
I was reading the goal doc about the timeline and found something in the 'Completion Criteria' section[1] which says:
- (2027.1) Get usage of Eventlet in oslo deliverables removed;
So in my mind, 2027.1 was either going to be the first release without eventlet support, or the last release to support eventlet mode (if supported in 2027.1, eventlet mode would not be the default mode and would be deprecated for removal in 2027.2).

My expectation is that the first release of nova to support running in threaded mode would be 2026.1, but I'm not convinced we will be confident enough to switch threaded mode on by default for all nova services in 2026.1. Basically, I think it may be functional but have some bugs or performance issues that will be refined in 2026.2, where we will look to change the default. Then in 2027.1 we will either start removing the option to run in eventlet mode, or deprecate it and remove it in 2027.2.

My logic is this: if we can run in threaded mode in all services in 2026.1, we will likely deprecate eventlet mode in 2026.1, but I'm not sure we will change the default and deprecate in the same release. Nova may decide to take a per-binary (api vs scheduler) approach to changing the default and doing the eventlet removal too, but we have not discussed that in detail. For example, if the nova-scheduler is able to run in threaded mode this cycle, we could deprecate running it in eventlet mode in 2026.1 and make threaded mode the default; however, we will still need eventlet for nova-compute. So we have a choice to make: only deprecate using eventlet in nova once all nova services are ready, or do it per service as each part is ready.
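As a hedged illustration of that per-binary approach: once a given binary runs reliably in threaded mode, a transitional toggle could be marked with oslo.config's standard deprecation metadata (the option name here is hypothetical, not an existing Nova option):

    # Illustrative only: marking a hypothetical per-service eventlet toggle
    # as deprecated-for-removal via standard oslo.config metadata.
    from oslo_config import cfg

    scheduler_opts = [
        cfg.BoolOpt('use_eventlet',
                    default=False,
                    deprecated_for_removal=True,
                    deprecated_since='2026.1',
                    deprecated_reason='nova-scheduler now defaults to native '
                                      'threading; eventlet mode is slated for '
                                      'removal in a later release.'),
    ]
    cfg.CONF.register_opts(scheduler_opts, group='scheduler')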
- "(2027.2) Get Eventlet retired from OpenStack;"
Again, 2027.2 (non-SLURP) is mentioned for eventlet retirement. I do not know if there is any technical reason to do it in a non-SLURP, or whether it can be moved to a SLURP release. Maybe hberaud knows.
I don't see doing this in a non-SLURP as a problem under our upgrade policy. Once we have support for running without it in a prior SLURP, even if it's not the default, we are free to deprecate eventlet support in that SLURP and remove it in a non-SLURP. To me this just reads as: we deprecate in 2027.1, announcing it will be removed in a future release, and then we do the removal in the non-SLURP.

I think doing the actual removal in a non-SLURP is better than doing it in a SLURP. I.e. the deprecations happen in the SLURPs; the changes of defaults and removal of code happen in the non-SLURPs. Operators who want to move fast or use newer Python releases can adopt the new defaults earlier in the non-SLURPs; those that want to move slowly get longer for the new mode to bake by skipping the release and getting the concurrency model change on upgrade.
Anyway, thanks gibi for bringing this. There are many projects that have not started the work yet (Nova might have more work compared to others), but I think we should discuss/re-discuss the timelines considering all projects/challenges. Accordingly, update the goal doc for the exact timelines and what the impact will be for projects that do not finish work as per the timelines (for example upgrade issue, workaround etc).
[1] https://governance.openstack.org/tc/goals/selected/remove-eventlet.html#comp...
-gmaan
I'm confused a bit -- the implementation details of our threading modules are not a public API that we owe deprecation periods for. Why are we treating it as such?

-JayF
On Jun 13, 2025 20:52, Jay Faulkner <jay@gr-oss.io> wrote:
I'm confused a bit -- the implementation details of our threading modules are not a public API that we owe deprecation periods for. Why are we treating it as such?
-JayF
Right. Plus, I don't get why operators get to choose what class of bugs they may experience, and how they will know better than contributors.

Cheers,

Thomas Goirand (zigo)
On Sat, 14 Jun 2025, 01:24, <thomas@goirand.fr> wrote:
Right. Plus, I don't get why operators get to choose what class of bugs they may experience, and how they will know better than contributors. Cheers,
Well, at least some classes of bugs might be already known, and operators might be willing to accept them, while new ones could be unknown to project maintainers and take quite some time to identify and patch. So being able to roll back to the old behavior might save the day.

An extremely good example of that is heartbeat_in_pthread from oslo.messaging: we flipped the default one day and marked the old behavior deprecated, until we realized a couple of cycles later that it had terrible consequences for WSGI applications, so the default had to be flipped back. Meanwhile, operators were dealing with and working around the consequences by changing the option back, selecting the old known bugs over the new ones, as the service was still working more reliably with the old behavior.
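For context, the rollback in that episode was a one-line, operator-side config change (the option is real; the value shown is just the "switch back" illustration):

    # oslo.messaging knob from the episode described above. Setting it back
    # restored the old heartbeat behavior while the regression was being
    # sorted out upstream.
    [oslo_messaging_rabbit]
    heartbeat_in_pthread = false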
On 6/14/25 05:07, Dmitriy Rabotyagov wrote:
On Sat, 14 Jun 2025, 01:24, <thomas@goirand.fr> wrote:
Right. Plus, I don't get why operators get to choose what class of bugs they may experience, and how they will know better than contributors. Cheers,
Well, at least some class of bugs might be already known and operators might be willing to accept them, while the new ones could be not known by project maintainers and take quite some time to realize and patch.
So being able to rollback to old behavior might save the day.
This is FUD. IMO, either upstream OpenStack provides a working setup, or it doesn't. In the latter case, your only available action is to help fix bugs. It is not up to the operators to second-guess what may or may not work. For beginners, this would be a horrible nightmare if default options simply wouldn't work. We *must* ship OpenStack working by default.

Cheers,

Thomas Goirand (zigo)
On 16/06/2025 13:09, Thomas Goirand wrote:
On 6/14/25 05:07, Dmitriy Rabotyagov wrote:
On Sat, 14 Jun 2025, 01:24, <thomas@goirand.fr> wrote:
Right. Plus, I don't get why operators get to choose what class of bugs they may experience, and how they will know better than contributors. Cheers,
Well, at least some class of bugs might be already known and operators might be willing to accept them, while the new ones could be not known by project maintainers and take quite some time to realize and patch.
So being able to rollback to old behavior might save the day.
This is FUD. IMO, either upstream OpenStack provides a working setup, or it doesn't.
Saying it's FUD is not helpful. We got a direct ask from operators and some cores to not do a hard switch-over. And while I wanted to only support one model for each binary at a time, we were specifically asked to make it configurable.
In the latter case, your only available action is to help fix bugs. It is not up to the operators to second-guess what may or may not work.
Correct, we are not planning to document how to change modes. We were planning to only use this configuration in CI, and operators would be told, for a given release, to deploy this way. This is an internal implementation detail; however, we are not prepared to deprecate using eventlet until we are convinced that we can run properly without it.
For beginners, this would be a horrible nightmare if default options simply wouldn't work. We *must* ship OpenStack working by default.

No one is suggesting we do otherwise.
Cheers,
Thomas Goirand (zigo)
Saying it's FUD is not helpful.
We got a direct ask from operators and some cores to not do a hard switch-over.
And while I wanted to only support one model for each binary at a time, we were specifically asked to make it configurable.
In the latter case, your only available action is to help fix bugs. It is not up to the operators to second-guess what may or may not work.
Correct, we are not planning to document how to change modes. We were planning to only use this configuration in CI, and operators would be
Well, we'd need to have that communicated so that deployment tooling could adapt their setups to the changes. For instance, in OSA the number of eventlet workers is calculated based on system facts, so we'd need to change that logic and also suggest how users should treat the new logic for their systems. So it will be kinda documented in a way after all.
told, for a given release, to deploy this way.
This is an internal implementation detail; however, we are not prepared to deprecate using eventlet until we are convinced
that we can run properly without it.
For beginners, this would be a horrible nightmare if default options simply wouldn't work. We *must* ship OpenStack working by default.

No one is suggesting we do otherwise.
Cheers,
Thomas Goirand (zigo)
On 16/06/2025 13:27, Dmitriy Rabotyagov wrote:
Saying it's FUD is not helpful.
We got a direct ask from operators and some cores to not do a hard switch-over.
And while I wanted to only support one model for each binary at a time, we were specifically asked to make it configurable.
> In the latter case, your only available action is to help fix bugs. It is not up to the operators to second-guess what may or may not work.
Correct, we are not planning to document how to change modes. We were planning to only use this configuration in CI, and operators would be
Well, we'd need to have that communicated so that deployment tooling could adapt their setups to the changes. For instance, in OSA the number of eventlet workers is calculated based on system facts, so we'd need to change that logic and also suggest how users should treat the new logic for their systems.
Why is OSA doing that at all today? We generally don't recommend changing those values from the default unless you really know what you're doing. I don't think other installers do that. TripleO, kolla-ansible and our new golang-based installer do not, nor does devstack, so it's surprising to me that OSA would change such low-level values by default.

We will document any new config options we add, and we are documenting how to tune the new options for thread pools, but we do not expect installation tools to modify them by default. We are explicitly not basing the options on the amount of resources on the host, i.e. dynamically calculating them based on the number of CPU cores.

For example, we are explicitly setting the number of scatter_gather threads in the dedicated thread pool to 5. Why? It's a nice small number that will work for most people out of the box. Can you adjust it? Yes, but it scales based on the number of nova cells you have, and 99% won't have more than 5 cells. Using information about the host where the API is deployed to infer that value would be incorrect. You can really only make an informed decision about how to tune it based on monitoring the usage of the pool.

That is how we expect most of the other tuning options to go as well. Our defaults in nova tend to be higher than you would actually need in a real environment, so while it may make sense to reduce them, we try to make sure they work out of the box for most people.

gibi is building up https://review.opendev.org/c/openstack/nova/+/949364/13/doc/source/admin/con... as part of nova's move to encode this, but our goal is that deployment tools should not need to be modified to tune these values by default.
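A minimal sketch of the pattern described above (not Nova's actual implementation): a small fixed-size pool used to scatter a query across cells and gather the results, sized by cell count rather than by host CPUs:

    # Sketch only: a dedicated, fixed-size pool for cell scatter-gather.
    # Five workers covers the common case of <= 5 cells out of the box;
    # host CPU count is deliberately irrelevant to this value.
    from concurrent.futures import ThreadPoolExecutor

    CELL_POOL_SIZE = 5

    def scatter_gather(cells, query):
        with ThreadPoolExecutor(max_workers=CELL_POOL_SIZE) as pool:
            futures = {cell: pool.submit(query, cell) for cell in cells}
            return {cell: f.result() for cell, f in futures.items()}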
So it will be kinda documented in a way after all.
told, for a given release, to deploy this way.
This is an internal implementation detail; however, we are not prepared to deprecate using eventlet until we are convinced
that we can run properly without it.
In case you try to use a 32 GB box with 16 cores as a controller for OpenStack, it will blow up with the default number of workers for WSGI and/or eventlet apps. While you can argue this should not be used as a production setup, it can be totally valid for sandboxes, and we want to provide consistent and reliable behavior for users.

But my argument was not about if/how we want to fine-tune deployments, but about understanding and providing the means to define what's needed, as well as the potential ability to revert in a worst-case scenario as a temporary workaround. So some variables and logic would still be introduced, from what I understand today.
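To make the sizing concern concrete, here is a rough, hypothetical version of the kind of fact-based calculation OSA-style tooling might do (this is not OSA's actual logic; the formula and constants are illustrative):

    # Hypothetical fact-based worker sizing (Linux-only memory probe).
    import os

    def api_workers(services_on_host, per_worker_mb=300, reserve_mb=8192):
        mem_mb = (os.sysconf('SC_PAGE_SIZE') *
                  os.sysconf('SC_PHYS_PAGES')) // (1024 * 1024)
        by_cpu = max(1, (os.cpu_count() or 1) // 2)
        by_mem = max(1, (mem_mb - reserve_mb) //
                     (per_worker_mb * services_on_host))
        # Take the stricter of the CPU- and memory-derived caps so a
        # 16-core/32 GB controller stacking many services is not
        # overcommitted by per-service worker defaults.
        return min(by_cpu, by_mem)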
Also, let's keep in mind that nova alone (with placement) will spawn 64 threads on such a setup by default. And then it all really depends on the set of services launched on the setup. So from the deployment tooling perspective, you have all the required data to roll out a setup that doesn't instantly OOM, at the cost of the number of requests services can process in parallel.
On 16/06/2025 15:33, Dmitriy Rabotyagov wrote:
Also, let's keep in mind that nova alone (with placement) will spawn 64 threads on such a setup by default.
Yep, see my other reply. You're mixing up workers and eventlet threads; those are two very different things. We likely should change the upstream default for workers to 1. We do tend to override that by default in installer tools.
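To illustrate the distinction (a sketch, not Nova's code): workers are forked OS processes managed by oslo.service, while eventlet threads are cooperative greenthreads inside each worker process:

    # Sketch of the two separate knobs being conflated above.
    import eventlet
    from oslo_service import service

    def run(conf, my_service):
        # Knob 1: "workers" forks this many OS processes (heavyweight).
        launcher = service.launch(conf, my_service, workers=conf.workers)
        launcher.wait()

    # Knob 2: within a single worker, concurrency comes from a pool of
    # cooperative greenthreads (lightweight, no extra processes).
    pool = eventlet.GreenPool(size=1000)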
And then it all really depends on the set of services launched on the setup.
So from the deployment tooling perspective, you have all the required data to roll out a setup that doesn't instantly OOM, at the cost of the number of requests services can process in parallel.
I think there is an option C.

Since the beginning of this initiative, the goal has always been to phase out Eventlet cleanly from OpenStack components, particularly Nova. The initial proposal targeted the 2027.2 release, at a time when Nova had no plans to support a dual concurrency model. At that point, Eventlet removal was not classified as a feature removal subject to SLURP policy - it was considered an internal refactoring, with no addition of transitional features requiring later removal. The idea was to deprecate any Eventlet-related features, public APIs, and configuration options as early as possible. As a result, SLURP policy constraints were not a primary concern.

But the situation has evolved. Nova has now committed to offering a full release cycle (2027.1) where both concurrency models - Eventlet and threading - are supported. This is not a technical convenience, but a response to explicit operator requests: operators want the ability to switch concurrency modes outside of an upgrade, in case of problems, during at least one series. The full threading support in Nova will not be ready before 2026.2, which is a non-SLURP release, and the dual-mode support must be preserved throughout 2027.1.

However, current SLURP policy requires that a removal can only happen in a SLURP release. According to Nova's plan, that excludes a removal in 2027.1. If we follow the rules strictly, Eventlet removal would be delayed to 2028.1, adding several months of dual-mode support, increased maintenance complexity, and growing technical debt.

From a technical standpoint, it's important to highlight that upcoming Python versions (3.14 and 3.15) introduce major changes to the GIL [1], threading behavior, and RLocks [2] - all areas that have historically been problematic in Eventlet. We've already had to implement significant patches for Python 3.13. There is also uncertainty around which Python versions will be adopted by the various Linux distributions and by our own supported runtime environments, making long-term compatibility assumptions with Eventlet increasingly risky; in all cases it will put additional pressure on us. Continuing to support Eventlet until 2028 is effectively betting on the stability of a module that is already on life support, with very few contributors available to maintain it beyond 2027 (see Jay's note about GR-OSS). More critically, there is no guarantee today that Eventlet, in its current state, will remain robust enough to reliably support a fallback path in case a regression forces a rollback from threading to Eventlet.

This is why I propose a pragmatic and balanced compromise: we maintain the planned deprecations (we might have to adjust those from oslo a little bit), but we make a documented and limited exception to allow Eventlet removal in 2027.2, which is a non-SLURP cycle. This exception would be coordinated with the Technical Committee and the release team and is justified by:

- the commitment made by Nova to offer a full cycle of dual-mode support in 2027.1;
- the high technical cost of maintaining dual concurrency models;
- the fast evolution of Python and the increasing risk of incompatibilities with Eventlet;
- the limited engineering resources available to sustain Eventlet maintenance;
- and the need to align with operator expectations while still securing the long-term health of the platform.

This approach gives Nova the entire 2027.1 cycle to deliver a complete and stable dual-mode implementation, as promised to operators.
At the same time, it enables the rest of the ecosystem to move forward safely, without relying on a dependency that is increasingly brittle and obsolete.

In large open source projects, it is not uncommon for policies to be temporarily adjusted in the face of structural or technical emergencies. This is not a convenience exception - it's a reasoned and responsible deviation from the norm, driven by necessity.

For this reason I am adding the release team and the TC to the party. Please share your feedback with us.

[1] https://docs.python.org/es/dev/whatsnew/3.14.html#whatsnew314-free-threaded-...
[2] https://docs.python.org/es/dev/whatsnew/3.15.html#threading.
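To make the fragility concrete: eventlet works by monkey-patching stdlib threading primitives, which is exactly the layer the free-threading work keeps changing. A small probe, using eventlet's patcher API and the CPython introspection hook added in 3.13 (guarded for older interpreters):

    # Probe of the monkey-patching layer that newer CPython keeps moving.
    import sys

    import eventlet
    eventlet.monkey_patch()

    from eventlet import patcher
    # threading/RLock primitives are replaced wholesale under the hood.
    print('thread patched:', patcher.is_monkey_patched('thread'))

    # Python 3.13+ free-threaded builds expose whether the GIL is active;
    # the attribute does not exist on older interpreters.
    if hasattr(sys, '_is_gil_enabled'):
        print('GIL enabled:', sys._is_gil_enabled())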
--
Hervé Beraud
Principal Software Engineer at Red Hat
irc: hberaud
https://github.com/4383/
On 17/06/2025 18:52, Herve Beraud wrote:
I think there is an option C.
Since the beginning of this initiative, the goal has always been to phase out Eventlet cleanly from OpenStack components, particularly Nova. The initial proposal targeted the 2027.2 release, at a time when Nova had no plans to support a dual concurrency model. At that point, Eventlet removal was not classified as a feature removal subject to SLURP policy - it was considered an internal refactoring, with no addition of transitional features requiring later removal. The idea was to deprecate any Eventlet-related features, public APIs, and configuration options as early as possible. As a result, SLURP policy constraints were not a primary concern.
But the situation has evolved. Nova has now committed to offering a full release cycle (2027.1) where both concurrency models - Eventlet and threading - are supported. This is not a technical convenience, but a response to explicit operator requests: operators want the ability to switch concurrency modes outside of an upgrade, in case of problems, during at least one series. The full threading support in Nova will not be ready before 2026.2, which is a non-SLURP release, and the dual-mode support must be preserved throughout 2027.1.
Unless this has changed since the PTG, we were hoping for the first release to support running without eventlet to be 2026.1, with the default changing in 2026.2. So 2026.1 would be the SLURP release where operators would switch away from eventlet before upgrading to 2027.1.
However, current SLURP policy requires that a removal can only happen in a SLURP release.
That is incorrect. Deprecation can only happen in SLURPs; removals can happen in either, provided the feature is deprecated in the prior SLURP.
According to Nova's plan that excludes a removal in 2027.1.
That is also incorrect. Removal is only excluded if we don't get to full threading support in 2026.1. We have not deprecated eventlet support yet, and we don't plan to do that formally until we have shown that nova can work without it.
If we follow the rules strictly, Eventlet removal would be delayed to 2028.1, adding several months of dual-mode support, increased maintenance complexity, and growing technical debt.

Again, this is very much misrepresenting the current plan.
From a technical standpoint, it’s important to highlight that upcoming Python versions (3.14 and 3.15) introduce major changes to the GIL [1], threading behavior, and RLocks - all areas [2] that have historically been problematic in Eventlet. We’ve already had to implement significant patches for Python 3.13. There is also uncertainty around which Python versions will be adopted by the various Linux distributions and by our own supported runtime environments, making long-term compatibility assumptions with Eventlet increasingly risky, but in all the cases it will put additional pressure on us. Continuing to support Eventlet until 2028 is effectively betting on the stability of a module that is already on life support, with very few contributors available to maintain it beyond 2027 (see Jay's note about GR-OSS). More critically, there is no guarantee today that Eventlet, in its current state, will remain robust enough to reliably support a fallback path in case a regression forces a rollback from threading to Eventlet.
This is why I propose a pragmatic and balanced compromise: we maintain the planned deprecation (we might have to adjust a little bit those from oslo), but we make a documented and limited exception to allow Eventlet removal in 2027.2, which is a non-SLURP cycle. This exception would be coordinated with the Technical Committee and the release team and is justified by:
- the commitment made by Nova to offer a full cycle of dual-mode support in 2027.1;
Again, that was not what we committed to. We aim to provide that in 2026.1; if we don't make that, the deprecation of eventlet support will slip to 2026.2, and we will support both in 2027.1 and do the removal in 2027.2. Support for eventlet in 2027.1 is predicated on not completing the work in 2026.1.
- the high technical cost of maintaining dual concurrency models;
- the fast evolution of Python and the increasing risk of incompatibilities with Eventlet;
- the limited engineering resources available to sustain Eventlet maintenance;
- and the need to align with operator expectations while still securing the long-term health of the platform.
This approach gives Nova the entire 2027.1 cycle to deliver a complete and stable dual-mode implementation, as promised to operators. At the same time, it enables the rest of the ecosystem to move forward safely, without relying on a dependency that is increasingly brittle and obsolete.
In large open source projects, it is not uncommon for policies to be temporarily adjusted in the face of structural or technical emergencies. This is not a convenience exception - it’s a reasoned and responsible deviation from the norm, driven by necessity.
For this reason I add the release team and the TC to the party. Please share your feedback with us.
[1] https://docs.python.org/es/dev/whatsnew/3.14.html#whatsnew314-free-threaded-...
[2] https://docs.python.org/es/dev/whatsnew/3.15.html#threading.
Thanks Sean for the clarifications.

On Tue, 17 Jun 2025 at 20:22, Sean Mooney <smooney@redhat.com> wrote:
On 17/06/2025 18:52, Herve Beraud wrote:
I think there is an option C.
Since the beginning of this initiative, the goal has always been to phase out Eventlet cleanly from OpenStack components, particularly Nova. The initial proposal targeted the 2027.2 release, at a time when Nova had no plans to support a dual concurrency model. At that point, Eventlet removal was not classified as a feature removal subject to SLURP policy - it was considered an internal refactoring, with no addition of transitional features requiring later removal. The idea was to deprecate any Eventlet-related features, public APIs, and configuration options as early as possible. As a result, SLURP policy constraints were not a primary concern.
But the situation has evolved. Nova has now committed to offering a full release cycle (2027.1) where both concurrency models - Eventlet and threading - are supported. This is not a technical convenience, but a response to explicit operator requests: operators want the ability to switch concurrency modes outside of an upgrade, in case of problems, for at least one series. The full threading support in Nova will not be ready before 2026.2, which is a non-SLURP release, and the dual-mode support must be preserved throughout 2027.1.
Unless this has changed since the PTG, we were hoping for the first release to support running without eventlet to be 2026.1, and for the default to change in 2026.2.

So 2026.1 would be the SLURP release where operators would switch away from eventlet before upgrading to 2027.1.
However, current SLURP policy requires that a removal can only happen in a SLURP release.
That is incorrect. Deprecation can only happen in SLURPs; removals can happen in either, provided the feature was deprecated in the prior SLURP.
According to Nova's plan that excludes a removal in 2027.1.
That is also incorrect. Removal is only excluded if we don't get to full threading support in 2026.1. We have not deprecated eventlet support yet, and we don't plan to do that formally until we have shown that nova can work without it.
If we follow the rules strictly, Eventlet removal would be delayed to 2028.1, adding several months of dual-mode support, increased maintenance complexity, and growing technical debt.

Again, this is very much misrepresenting the current plan.
From a technical standpoint, it’s important to highlight that upcoming Python versions (3.14 and 3.15) introduce major changes to the GIL [1], threading behavior, and RLocks [2] - all areas that have historically been problematic in Eventlet. We’ve already had to implement significant patches for Python 3.13. There is also uncertainty around which Python versions will be adopted by the various Linux distributions and by our own supported runtime environments, making long-term compatibility assumptions with Eventlet increasingly risky; in all cases it will put additional pressure on us. Continuing to support Eventlet until 2028 is effectively betting on the stability of a module that is already on life support, with very few contributors available to maintain it beyond 2027 (see Jay's note about GR-OSS). More critically, there is no guarantee today that Eventlet, in its current state, will remain robust enough to reliably support a fallback path in case a regression forces a rollback from threading to Eventlet.
This is why I propose a pragmatic and balanced compromise: we maintain the planned deprecations (we might have to adjust the oslo ones a little), but we make a documented and limited exception to allow Eventlet removal in 2027.2, which is a non-SLURP cycle. This exception would be coordinated with the Technical Committee and the release team and is justified by:
- the commitment made by Nova to offer a full cycle of dual-mode support in 2027.1;
Again, that was not what we committed to. We aim to provide that in 2026.1. If we don't make that, the deprecation of eventlet support will slip to 2026.2, we will support both modes in 2027.1, and do the removal in 2027.2.

Support for eventlet in 2027.1 is predicated on not completing the work in 2026.1.
- the high technical cost of maintaining dual concurrency models;
- the fast evolution of Python and the increasing risk of incompatibilities with Eventlet;
- the limited engineering resources available to sustain Eventlet maintenance;
- and the need to align with operator expectations while still securing the long-term health of the platform.
This approach gives Nova the entire 2027.1 cycle to deliver a complete and stable dual-mode implementation, as promised to operators. At the same time, it enables the rest of the ecosystem to move forward safely, without relying on a dependency that is increasingly brittle and obsolete.
In large open source projects, it is not uncommon for policies to be temporarily adjusted in the face of structural or technical emergencies. This is not a convenience exception - it’s a reasoned and responsible deviation from the norm, driven by necessity.
For this reason, I'm adding the release team and the TC to the party. Please share your feedback with us.
[1]
https://docs.python.org/es/dev/whatsnew/3.14.html#whatsnew314-free-threaded-...
[2] https://docs.python.org/es/dev/whatsnew/3.15.html#threading.
On Mon, 16 Jun 2025 at 18:06, Sean Mooney <smooney@redhat.com> wrote:
On 16/06/2025 15:33, Dmitriy Rabotyagov wrote:
> Also let's keep in mind that only nova (with placement) will spawn 64 threads on such a setup by default.
Yep, see my other reply.
You are mixing up workers and eventlet threads.
Those are two very different things.
We likely should change the upstream default for workers to 1.
We do tend to override that by default in installer tools.
-- Hervé Beraud Principal Software Engineer at Red Hat irc: hberaud https://github.com/4383/
On 16/06/2025 14:24, Dmitriy Rabotyagov wrote:
In case you try to use a 32 GB box with 16 cores as a controller for OpenStack - it will blow up with the default number of workers for WSGI and/or eventlet apps.
I think you are conflating workers and eventlet tuning, which are two very different things.

The default for nova-api depends on how you deploy it, but normally you start with 1-2 worker processes for the API. We do seem to be defaulting to 1 worker process per core for the conductor and scheduler, which likely should be set to 1 as well:

https://github.com/openstack-k8s-operators/nova-operator/blob/main/templates...
https://github.com/openstack/kolla-ansible/blob/master/ansible/roles/nova/te...
https://github.com/openstack/kolla-ansible/blob/master/ansible/roles/nova/te...

Those have nothing to do with eventlet, however. The only eventlet-specific tunables nova has are the following:

https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.def...
https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.syn...
https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.exe...

These are what gibi and I were referring to when we said there will be new tuning options for threaded mode. Those existing greenthread pools will be replaced with new executors that will need to be configured differently.

The worker options are not being removed or changed as part of eventlet removal, although they probably should be updated to default to 1 instead of $(nproc) and then be overridden by the deployer based on their own knowledge of the available resources.
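[To make the workers-versus-pools distinction concrete, a minimal Python sketch - illustrative only, not Nova code; the option name in the comment is hypothetical:]

# Illustrative sketch only. It contrasts an effectively free greenthread
# pool with a native thread pool whose size is a real resource decision,
# which is why new tuning options become necessary in threaded mode.
from concurrent.futures import ThreadPoolExecutor

def handle_request(req):
    # stand-in for real request handling
    return req["id"]

# eventlet style (commented out to avoid the dependency):
#   import eventlet
#   pool = eventlet.GreenPool(1000)   # 1000 greenlets cost almost nothing
#   pool.spawn_n(handle_request, {"id": 1})

# native-thread style: every worker is an OS thread, so the size is a
# deliberate, tunable choice (hypothetical option: [DEFAULT] executor_pool_size)
executor = ThreadPoolExecutor(max_workers=8)
print(executor.submit(handle_request, {"id": 1}).result())
executor.shutdown()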
While you can argue this should not be used as a production setup, it can be totally valid for sandboxes, and we want to provide consistent and reliable behavior for users.
But my argument was not about if/how we want to fine-tune deployments, but about understanding and providing the means to define what's needed, as well as a potential ability to revert in a worst-case scenario as a temporary workaround. So some variables and logic would still be introduced, from what I understand today.
On Mon, 16 Jun 2025, 14:43 Sean Mooney, <smooney@redhat.com> wrote:
On 16/06/2025 13:27, Dmitriy Rabotyagov wrote:
>> saying it's FUD is not helpful.
>> we got a direct ask from operators and some cores to not do a hard switch over.
>> and while I wanted to only support one model for each binary at a time, we were specifically asked to make it configurable.
>>> In the latter case, your only available action is to help fixing bugs. It is not up to the operators to double-guess what may or may not work.
>> correct, we are not planning to document how to change mode; we were planning to only use this configuration in CI and operators would be
> Well, we'd need to have that communicated so that deployment toolings could adapt their setup to changes, as, for instance, in OSA the amount of eventlet workers is calculated based on system facts, so we'd need to change the logic and also suggest how users should treat this new logic for their systems.
Why is OSA doing that at all today? We generally don't recommend changing those values from the default unless you really know what you're doing. I don't think other installers do that: TripleO, kolla-ansible and our new Golang-based installer do not, nor does devstack, so it's surprising to me that OSA would change such low-level values by default.

We will document any new config options we add, and we are documenting how to tune the new options for thread pools, but we do not expect installation tools to modify them by default. We are explicitly not basing the options on the amount of resources on the host, i.e. dynamically calculating them from the number of CPU cores.

For example, we are explicitly setting the number of scatter_gather threads in the dedicated thread pool to 5, because it's a nice small number that will work for most people out of the box.

Can you adjust it? Yes, but it scales with the number of nova cells you have, and 99% won't have more than 5 cells.

Using information about the host where the API is deployed to infer that value would be incorrect.

You can really only make an informed decision about how to tune it based on monitoring the usage of the pool.

That is how we expect most of the other tuning options to go as well.

Our defaults in nova tend to be higher than you would actually need in a real environment, so while it may make sense to reduce them, we try to make sure they work out of the box for most people.
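[To illustrate the scatter/gather sizing point, a minimal Python sketch - hypothetical names, not nova's actual implementation:]

# Minimal sketch of scatter/gather across cells with a small fixed pool.
# query_cell() and CELLS are hypothetical stand-ins, not Nova's real code.
from concurrent.futures import ThreadPoolExecutor, as_completed

CELLS = ["cell1", "cell2", "cell3"]

def query_cell(cell):
    # stand-in for a per-cell database or RPC query
    return cell, f"instances from {cell}"

# The pool size tracks the number of cells (almost always <= 5), not the
# host CPU count, so deriving it from system facts would tune the wrong
# variable.
with ThreadPoolExecutor(max_workers=5) as pool:
    futures = [pool.submit(query_cell, c) for c in CELLS]
    results = dict(f.result() for f in as_completed(futures))

print(results)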
gibi is building up https://review.opendev.org/c/openstack/nova/+/949364/13/doc/source/admin/con...

as part of nova's move to encode this, but our goal is that deployment tools should not need to be modified to tune these values by default.
> So it will be kinda documented in a way after all.

>> told for a given release deploy this way.
>> this is an internal implementation detail, however we are not prepared to deprecate using eventlet until we are convinced that we can run properly without it.

>>> For beginners, this would be a horrible nightmare if default options simply wouldn't work. We *must* ship OpenStack working by default.

>> no one is suggesting we do otherwise.

>>> Cheers,
>>> Thomas Goirand (zigo)
On Sat, Jun 14, 2025 at 1:24 AM <thomas@goirand.fr> wrote:
On Jun 13, 2025 20:52, Jay Faulkner <jay@gr-oss.io> wrote:
I'm confused a bit -- the implementation details of our threading modules are not a public API that we owe deprecation periods for. Why are we treating it as such?
-JayF
Right. Plus I don't get why operators get to choose what class of bugs they may experience, and how they will know better than contributors.
The new concurrency model in nova (native threading) needs different performance tuning than the previous one (eventlet). The cost of having 1000 eventlets is negligible, but having 1000 threads to replace them will blow up the memory usage of the service. Operators expressed that having such a tuning effort happen during an upgrade, without a temporary way back to the old model, is scary. And honestly, I agree.

Similarly, we expect nasty bugs in the new model as it is a significant architectural change. So having no way to go back to a known good state temporarily while the bug is fixed or worked around is scary.

Third, if we want to keep CI green while we are transforming nova services to the new model without keeping a big feature branch, then we need to be able to land code that passes CI while things are half transformed. The only way we can do that is if we support both concurrency modes in parallel for a while.

Cheers,
gibi
Cheers,
Thomas Goirand (zigo)
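[To put rough numbers on gibi's memory argument above: a back-of-envelope Python sketch using typical Linux defaults, not measured nova figures; the per-greenlet cost is an order-of-magnitude assumption:]

# Back-of-envelope arithmetic, not a benchmark: a native thread reserves its
# own stack (commonly ~8 MiB of virtual memory on Linux), while a greenlet
# costs on the order of a few KiB.
import threading

THREAD_STACK = 8 * 1024 * 1024   # typical pthread default stack reservation
GREENLET_COST = 4 * 1024         # rough order of magnitude per greenlet

workers = 1000
print(f"{workers} threads   ~ {workers * THREAD_STACK / 2**30:.1f} GiB reserved")
print(f"{workers} greenlets ~ {workers * GREENLET_COST / 2**20:.1f} MiB")

# One mitigation: cap the stack size before any worker threads are started.
threading.stack_size(512 * 1024)  # 512 KiB per thread instead of the default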
On 16/06/2025 10:11, Balazs Gibizer wrote:
On Sat, Jun 14, 2025 at 1:24 AM <thomas@goirand.fr> wrote:
On Jun 13, 2025 20:52, Jay Faulkner <jay@gr-oss.io> wrote:
I'm confused a bit -- the implementation details of our threading modules are not a public API that we owe deprecation periods for. Why are we treating it as such?
-JayF

Right. Plus I don't get why operators get to choose what class of bugs they may experience, and how they will know better than contributors.
Just to address one thing: we don't really intend to expose the configurability to operators. We are building it in so that we (the core team) can test both versions and choose when to move each component to the new mode. The environment setting could be set by an operator to work around bugs if/when they happen, but our intent is that we would choose the mode each binary should run in, and the env var will just be for our internal use. Having it does provide us an escape hatch to revert back to the old mode of operation if there is a high-severity bug.

We still have the ability to run os-vif in the CLI mode using ovs-vsctl instead of the OVS Python bindings:

https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L72-L82

That was vital when OVS changed their implementation such that a reconnect would block the nova-compute agent for multiple seconds. Ironically, that was also eventlet related, but having the old, venerable, slow CLI-based driver as a fallback mitigated most of the impact until the OVS C and Python bindings could be fixed. That took the better part of a year to do and have it released/backported. I'm not saying it will take us the same amount of time if we have a bug in the threading mode, but it's possible.

We reported the eventlet-related concurrency bug on 2021-05-24:

https://bugs.launchpad.net/os-vif/+bug/1929446

The fix in ovsdbapp merged on Dec 2, 2021:

https://github.com/openstack/ovsdbapp/commit/a2d3ef2a6491eb63b5ee961fc930070...

and we still had backports of this being merged up until 2023-05-22 as distros back-ported the original OVS change into older releases of OVS. This is the type of "nasty bugs" gibi was referring to.

I, for one, wanted to only support one mode of operation per service binary per release, but I do see value, if for no other reason than debugging, in being able to revert to the old behavior. The fact that we had the vsctl driver made it very clear that this OVS bug was in the OVS lib or Python bindings, as we could revert to the other implementation and show it only happened in the native code path.
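[The escape-hatch pattern described above can be sketched generically in Python - all names below are hypothetical illustrations, not the actual os-vif or nova options:]

# Generic sketch of the escape-hatch pattern: keep the old implementation
# selectable so a severe bug in the new path has a known-good fallback.
# MYSVC_FORCE_LEGACY_BACKEND and both classes are hypothetical names.
import os

class NativeBackend:
    """New, fast implementation (e.g. native library bindings)."""
    def plug(self):
        return "plugged via native bindings"

class LegacyCliBackend:
    """Old, slower implementation kept as the fallback (e.g. shelling out)."""
    def plug(self):
        return "plugged via CLI calls"

def get_backend():
    # Internal-use switch, not a documented public knob: the core team flips
    # it in CI or advises it as a temporary workaround for a severe bug.
    if os.environ.get("MYSVC_FORCE_LEGACY_BACKEND"):
        return LegacyCliBackend()
    return NativeBackend()

print(get_backend().plug())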
The new concurrency model in nova (native threading) needs different performance tuning than the previous (eventlet). The cost of having 1000 eventlets is negligible but having 1000 threads to replace that will blow up the memory usage of the service. Operators expressed that having such tuning effort happening during upgrade without a temporary way back to the old model is scary. And honestly I agree.
Similarly we expect nasty bugs in the new model as it is a significant architectural change. So having no way to go back to a known good state temporarily while the bug is fixed or worked around is scary.
Third, if we want to keep green CI while we are transforming nova services to the new model without keeping a big feature branch then we need to be able to land code that passes CI while things are half transformed. The only way we can do that is if we support both concurrency modes in parallel for a while.
Cheers, gibi
Cheers,
Thomas Goirand (zigo)
Hey team,

Just my two cents on this topic. As an operator, I am eager to get rid of eventlet everywhere. The current situation is painful and very hard to debug on a daily basis. There is no day without hearing from the team: oh, you know, maybe it's an eventlet bug :)

Cheers,
Arnaud

On 16.06.25 - 11:11, Balazs Gibizer wrote:
On Sat, Jun 14, 2025 at 1:24 AM <thomas@goirand.fr> wrote:
On Jun 13, 2025 20:52, Jay Faulkner <jay@gr-oss.io> wrote:
I'm confused a bit -- the implementation details of our threading modules are not a public API that we owe deprecation periods for. Why are we treating it as such?
-JayF
Right. Plus I don't get why operators get to choose what class of bugs they may experience, and how they will know better than contributors.
The new concurrency model in nova (native threading) needs different performance tuning than the previous (eventlet). The cost of having 1000 eventlets is negligible but having 1000 threads to replace that will blow up the memory usage of the service. Operators expressed that having such tuning effort happening during upgrade without a temporary way back to the old model is scary. And honestly I agree.
Similarly we expect nasty bugs in the new model as it is a significant architectural change. So having no way to go back to a known good state temporarily while the bug is fixed or worked around is scary.
Third, if we want to keep green CI while we are transforming nova services to the new model without keeping a big feature branch then we need to be able to land code that passes CI while things are half transformed. The only way we can do that is if we support both concurrency modes in parallel for a while.
Cheers, gibi
Cheers,
Thomas Goirand (zigo)
On Jun 13, 2025 14:09, Balazs Gibizer <gibi@redhat.com> wrote:
What do you think?
Cheers
gibi
Hi,

Thanks for sharing your thoughts. Mostly: as the Debian package maintainer of OpenStack, the current situation is VERY uncomfortable. :(

It all depends on whether we can get Eventlet fixed for Python 3.13. Worst case, if nobody finds a way to fix https://github.com/eventlet/eventlet/issues/1032 then from my perspective (i.e. as the Debian package maintainer), Nova MUST be fixed for 2026.1. Indeed, otherwise Debian 12 will be EOL too early, putting Debian OpenStack operators at risk of unpatched security bugs (I do not want to rely too much on the Debian LTS effort).

Do you think that's possible? If not, then some more effort should be spent on maintaining Eventlet and making it work on Python 3.13 (and 3.14).

Cheers,
Thomas Goirand (zigo)
(replying to the top of the thread for better visibility)

Based on the discussions that occurred on this thread, coming from the concern that Gibi raised, I proposed a governance patch modifying the completion dates of the TC goal:

https://review.opendev.org/c/openstack/governance/+/952903/

Please comment heavily on that change proposal.

Thanks,
-Sylvain

On Fri, 13 Jun 2025 at 14:09, Balazs Gibizer <gibi@redhat.com> wrote:
participants (10)
- Arnaud Morin
- Balazs Gibizer
- Dmitriy Rabotyagov
- Ghanshyam Mann
- Herve Beraud
- Jay Faulkner
- Sean Mooney
- Sylvain Bauza
- Thomas Goirand
- thomas@goirand.fr