[openstacksdk][keystoneauth1][nova] Memory leak in nova-scheduler, possibly all services?
Hello everyone,

We've been seeing steady memory leaks in some of our OpenStack API services. I think I've found the cause, but I'd appreciate some feedback.

When a service creates a client object (e.g. placementclient) to use another service's API, the call goes through Proxy.request() in the openstacksdk module. That function calls _report_stats()[1] for every request, which in turn calls three other functions for statsd, Prometheus and InfluxDB. The Prometheus function records the number of requests and the response times in a dict keyed by the full URL of the request.[2] This data is recorded unless the _prometheus_counter or _prometheus_histogram member variables are None.[2] Those member variables are set by config.CloudRegion.get_prometheus_counter()[3] and config.CloudRegion.get_prometheus_histogram()[4], which do not check any config settings. It looks like the only way to prevent this behavior is to uninstall the prometheus_client module.[5] Unfortunately, prometheus_client is required by oslo.metrics[6], which is required by oslo.messaging[7]. This is causing a lot of extra memory usage in our production environments.

It's worst in our build farm, where we create over 500K VMs per week. nova-scheduler queries a lot of unique URLs because the Placement API uses the VM's UUID to get allocations[8], and each URL gets its own counter and histogram. Even though we run 6 copies of nova-scheduler (currently Caracal), after a few weeks each copy will reach 8 GB and the worker processes will start getting OOM-killed, putting the new VM into an error state. The nova-scheduler parent process keeps running and restarts the workers, which just keep getting killed, and VMs can't be reliably scheduled until nova-scheduler is restarted.

This feels like two problems to me. First, we don't need Prometheus metrics from the OpenStack services, but there's no way to turn them off. The Session class in keystoneauth1 supports a collect-timing option[9] (default False), but it's only used in one or two places that I can find. The CloudRegion methods get their config from the Adapter[10] class in keystoneauth1, which does not support collect-timing. Should it? Second, even if we did use these Prometheus metrics, it seems like they shouldn't be allowed to grow unbounded forever. Should the Proxy._report_stats_prometheus() function[2] limit the metrics by age or quantity?

Any thoughts? I've patched our services in-house to turn off these metrics (I'll have results in a week). In the meantime, are we doing something wrong, or is every OpenStack service constantly collecting these metrics?

-- Sam Clippinger

1: https://github.com/openstack/openstacksdk/blob/stable/2025.2/openstack/proxy...
2: https://github.com/openstack/openstacksdk/blob/stable/2025.2/openstack/proxy...
3: https://github.com/openstack/openstacksdk/blob/stable/2025.2/openstack/confi...
4: https://github.com/openstack/openstacksdk/blob/stable/2025.2/openstack/confi...
5: https://github.com/openstack/openstacksdk/blob/stable/2025.2/openstack/confi...
6: https://github.com/openstack/oslo.metrics/blob/stable/2025.2/requirements.tx...
7: https://github.com/openstack/oslo.messaging/blob/stable/2025.2/requirements....
8: https://docs.openstack.org/api-ref/placement/#list-allocations
9: https://github.com/openstack/keystoneauth/blob/stable/2025.2/keystoneauth1/l...
10: https://github.com/openstack/keystoneauth/blob/stable/2025.2/keystoneauth1/l...
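To make the growth mechanism concrete, here is a minimal, self-contained sketch of how prometheus_client behaves when the full request URL is used as a label value. This is not the openstacksdk code itself; the metric names and labels below are made up for illustration. The point is that prometheus_client keeps one child metric in memory per unique label combination for the lifetime of the process, so a label that contains a per-VM UUID grows without bound:

import uuid

from prometheus_client import Counter, Histogram

# Hypothetical metric names and labels, for illustration only.
requests_total = Counter(
    'sdk_http_requests', 'HTTP requests made by the SDK',
    ['method', 'endpoint'])
request_latency = Histogram(
    'sdk_http_latency_seconds', 'HTTP request latency in seconds',
    ['method', 'endpoint'])

# Simulate a scheduler asking Placement about many different VMs: every
# distinct UUID produces a distinct 'endpoint' label value, and therefore
# a new child counter and a new child histogram that are never released.
for _ in range(10_000):
    url = '/allocations/%s' % uuid.uuid4()
    requests_total.labels(method='GET', endpoint=url).inc()
    request_latency.labels(method='GET', endpoint=url).observe(0.05)

# Every unique label set shows up as extra samples held in process memory.
counter_samples = sum(len(m.samples) for m in requests_total.collect())
histogram_samples = sum(len(m.samples) for m in request_latency.collect())
print('counter samples held in memory: %d' % counter_samples)
print('histogram samples held in memory: %d' % histogram_samples)

Bounding the label cardinality (for example, replacing the UUID portion of the URL with a placeholder before using it as a label value) would keep the number of children fixed, which is one way the "limit by quantity" idea above could work.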
Sam,

I'm no expert in this area, but I thought I'd give you some information, as we're also still running Caracal in production, so it might be useful to know that your experience is not universal. I checked my environment, and while we're not creating anywhere near the same number of instances in that timeframe, our nova-scheduler memory usage has been steady for months straight. It might be something that is deployment dependent, i.e. specific to your deployment. In our case we're using kolla-ansible, where the Prometheus services get their own containers, and the nova-scheduler container doesn't appear to have any Prometheus services running.

Following the case you've identified in your message, it sounds like you're looking at the nova-scheduler process accumulating an ever-growing data set through _report_stats() calling _report_stats_prometheus(response, url, method, exc). Presumably, since your environment is adding 500,000 instances each week, these counters are always increasing (and also, as you mentioned, not necessary), and this is what you're attributing your memory usage to?

In your deployment, are you enforcing any memory limits on the controller nodes? (For example, in older kolla-ansible deployments you might have mem_limit configured as a holdout from Rocky, or you might have customised your container deployments to have limited access to memory.) You can check on a controller by inspecting the container as a user in the docker group:

docker inspect nova_scheduler | grep -i -C 4 mem

If you are not running services in containers, or if you are not enforcing any container memory limits, are the controller nodes themselves running out of memory? What kind of operating system and memory configuration do your controller nodes have? Further, what exactly is the content of the responses you're seeing in the late stages of nova-scheduler failing?

Just as a thought: when we sized our environment, we consulted with a vendor who recommended 3 to 5 controllers for 100k-200k instances, depending on workload. It could be that your workload exceeds the capability of the controllers you have?

Kind Regards,

Joel McLean - Micron21 Pty Ltd
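As a quick way to answer the "are the workers really growing" question, here is a small diagnostic sketch. It assumes psutil is installed on the controller and that the scheduler processes have 'nova-scheduler' somewhere in their command line; it just prints the resident set size of each matching process, so running it periodically shows whether memory is actually climbing:

import psutil


def report_scheduler_rss():
    # Print the RSS of every process whose command line mentions
    # nova-scheduler, plus a total across all of them.
    total = 0
    for proc in psutil.process_iter(['pid', 'cmdline']):
        try:
            cmdline = ' '.join(proc.info['cmdline'] or [])
            if 'nova-scheduler' not in cmdline:
                continue
            rss = proc.memory_info().rss
            total += rss
            print('pid %d: %.1f MiB' % (proc.info['pid'], rss / 2**20))
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue
    print('total: %.1f MiB' % (total / 2**20))


if __name__ == '__main__':
    report_scheduler_rss()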
Hi Sam,

On Fri, 2026-01-02 at 16:17 +0000, Clippinger, Sam wrote:
When a service creates a client object (e.g. placementclient) to use another service's API, the call goes through Proxy.request() in the openstacksdk module. That function calls _report_stats()[1] for every request which calls three other functions for statsd, prometheus and influxdb. The prometheus function records the number of requests and response times in a dict using the full URL of the request.[2]
I pointed your message out to my colleagues yesterday, and one of them, Callum Dickinson, has dug into it. He reckons he's found the issue and has submitted a changeset to resolve it: https://review.opendev.org/c/openstack/openstacksdk/+/972237

Posting on Callum's behalf as he isn't subscribed.

Cheers,
Andrew

--
Andrew Ruthven, Wellington, New Zealand
andrew@etc.gen.nz                   | Catalyst Cloud:
This space intentionally left blank | https://catalystcloud.nz |
Excellent, that pretty much looks like how I patched it internally too. I'll be able to tell by the end of the week if it fixed our issue; I'll post back here with my findings.

-- Sam Clippinger
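For anyone who needs relief before the upstream change merges, here is a purely hypothetical sketch of the kind of workaround discussed above. It is not Sam's in-house patch and not the content of the changeset linked above; it simply relies on the behaviour described earlier in the thread (reporting is skipped when the counter and histogram are None, and those objects come from the CloudRegion getters), and it assumes the class lives at openstack.config.cloud_region.CloudRegion:

# Hypothetical workaround sketch -- not the in-house patch and not the
# upstream changeset. Per the behaviour described earlier in the thread,
# the SDK skips Prometheus reporting when these getters return None.
# Apply at service startup, before any SDK connections are created.
from openstack.config import cloud_region


def _disabled_prometheus_metric(self):
    # Returning None makes the Prometheus reporting path a no-op.
    return None


cloud_region.CloudRegion.get_prometheus_counter = _disabled_prometheus_metric
cloud_region.CloudRegion.get_prometheus_histogram = _disabled_prometheus_metric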
participants (4)
- Andrew Ruthven
- Clippinger, Sam
- Joel McLean
- Mike Lowe