<div dir="ltr"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">Hello lists,<br><br>With the Heat team's help I figured it out. Thanks, Jay, for looking into it.</span><br style="text-decoration-style:initial;text-decoration-color:initial"><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">The issue comes from [1], where max_overflow is raised to</span><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline"> executor_thread_pool_size if it is set to a lower value, to address</span><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">another issue. In my case, I had plenty of RAM and CPU, so I could</span><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">push for more threads, but I was short on db connections. 
The formula to</span><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">calculate the number of connections can be like this:</span><br style="text-decoration-style:initial;text-decoration-color:initial"><div style="text-decoration-style:initial;text-decoration-color:initial">num_heat_hosts=4</div><div style="text-decoration-style:initial;text-decoration-color:initial">heat_api_workers=2</div><div style="text-decoration-style:initial;text-decoration-color:initial">heat_api_cfn_workers=2</div><div style="text-decoration-style:initial;text-decoration-color:initial">num_engine_workers=4<br><span style="background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">executor_thread_pool_size = 22</span><br></div><div style="text-decoration-style:initial;text-decoration-color:initial">max_pool_size=4</div><div style="text-decoration-style:initial;text-decoration-color:initial">max_overflow=<span style="background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">executor_thread_pool_size<br>num_heat_hosts * (max_pool_size + max_overflow) * (heat_api_workers + num_engine_workers + heat_api_cfn_workers) <br></span></div><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">832</span><br style="text-decoration-style:initial;text-decoration-color:initial"><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">A note for medium to large magnum deployments: see the options</span><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">we have changed in heat conf 
and adjust them to your needs.</span><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">The db configuration described here, together with changes we found in a</span><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">previous scale test, can help keep the magnum and heat services stable.</span><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline"> </span><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">For large stacks or projects with many stacks, you need to change</span><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">the following options to these values, or to values better suited to your needs.</span><br style="text-decoration-style:initial;text-decoration-color:initial"><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">[DEFAULT]</span><br style="text-decoration-style:initial;text-decoration-color:initial"><span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">executor_thread_pool_size = 22</span><br style="text-decoration-style:initial;text-decoration-color:initial"><div style="text-decoration-style:initial;text-decoration-color:initial">max_resources_per_stack = -1<br>max_stacks_per_tenant = 10000<br>action_retry_limit = 10<br>client_retry_limit = 10<br>engine_life_check_timeout = 600<br>max_template_size = 
5242880<br>rpc_poll_timeout = 600<br>rpc_response_timeout = 600<br>num_engine_workers = 4<br><br>[database]<br>max_pool_size = 4<br>max_overflow = 22<br></div><br style="text-decoration-style:initial;text-decoration-color:initial"><div style="text-decoration-style:initial;text-decoration-color:initial">[heat_api] </div><div style="text-decoration-style:initial;text-decoration-color:initial">workers = 2<br><br>[heat_api_cfn]<br>workers = 2<br><br>Cheers,<br>Spyros<br><br>P.S. We will update the magnum docs as well.<br><br>[1] <a href="http://git.openstack.org/cgit/openstack/heat/tree/heat/engine/service.py#n375">http://git.openstack.org/cgit/openstack/heat/tree/heat/engine/service.py#n375</a></div><br><br><div class="gmail_quote"><div dir="ltr">On Mon, 18 Jun 2018 at 19:39, Jay Pipes &lt;<a href="mailto:jaypipes@gmail.com">jaypipes@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">+openstack-dev since I believe this is an issue with the Heat source code.<br>
<br>
On 06/18/2018 11:19 AM, Spyros Trigazis wrote:<br>
> Hello list,<br>
> <br>
> I'm hitting this [1] exception with heat quite easily. The db server is <br>
> configured to have 1000<br>
> max_connections and 1000 max_user_connections, and in the database <br>
> section of heat<br>
> conf I have these values set:<br>
> max_pool_size = 22<br>
> max_overflow = 0<br>
> Full config attached.<br>
> <br>
> I ended up with this configuration based on this formula:<br>
> num_heat_hosts=4<br>
> heat_api_workers=2<br>
> heat_api_cfn_workers=2<br>
> num_engine_workers=4<br>
> max_pool_size=22<br>
> max_overflow=0<br>
> num_heat_hosts * (max_pool_size + max_overflow) * (heat_api_workers + <br>
> num_engine_workers + heat_api_cfn_workers)<br>
> 704<br>
> <br>
> What I have noticed is that the number of connections I expected with <br>
> the above formula is not respected.<br>
> Based on this formula each node (every node runs the heat-api, <br>
> heat-api-cfn and heat-engine) should<br>
> use up to 176 connections but they even reach 400 connections.<br>
> <br>
> Has anyone noticed a similar behavior?<br>
<br>
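The connection estimate in the quoted message can be sketched in Python as below. This is only a rough illustration of the formula as written in this thread, not Heat code: the function names are made up for the example, and effective_max_overflow is an assumed reading of the behaviour described for [1] at the top of the thread (max_overflow raised to executor_thread_pool_size when it is lower).<br>

```python
def effective_max_overflow(max_overflow, executor_thread_pool_size):
    """Assumed reading of the thread: overflow is raised to the thread pool size if lower."""
    return max(max_overflow, executor_thread_pool_size)


def max_db_connections(num_heat_hosts, max_pool_size, max_overflow,
                       heat_api_workers, heat_api_cfn_workers,
                       num_engine_workers):
    """Upper bound on db connections, following the formula in the thread."""
    workers_per_host = (heat_api_workers + heat_api_cfn_workers
                        + num_engine_workers)
    return num_heat_hosts * (max_pool_size + max_overflow) * workers_per_host


# Values from the original question: max_pool_size=22, max_overflow=0.
original = max_db_connections(4, 22, 0, 2, 2, 4)
per_host = original // 4

# Values from the follow-up: max_pool_size=4, max_overflow=22.
revised = max_db_connections(4, 4, 22, 2, 2, 4)

print(original, per_host, revised)
```

With the numbers from the thread this gives 704 in total (176 per host) for the first configuration and 832 for the second. If the overflow really is raised as described for [1], the first estimate undercounts whenever executor_thread_pool_size is larger than the configured max_overflow.<br>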
Looking through the Heat code, I see that there are many methods in the <br>
/heat/db/sqlalchemy/api.py module that use a SQLAlchemy session but <br>
never actually call session.close() [1] which means that the session <br>
will not be released back to the connection pool, which might be the <br>
reason why connections keep piling up.<br>
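To illustrate the argument above, here is a toy model of a fixed-size pool. It is not Heat, SQLAlchemy, or oslo.db code: ToyPool and session_scope are hypothetical names, and the pool only counts checked-out connections. It shows why a scope that always releases keeps the pool usable, while callers that never release exhaust it.<br>

```python
from contextlib import contextmanager


class ToyPool:
    """Hypothetical fixed-size pool; it only counts checked-out connections."""

    def __init__(self, size):
        self.size = size
        self.checked_out = 0

    def acquire(self):
        if self.checked_out >= self.size:
            raise RuntimeError("pool exhausted")
        self.checked_out += 1

    def release(self):
        self.checked_out -= 1


@contextmanager
def session_scope(pool):
    """Acquire a connection and always hand it back, even on error."""
    pool.acquire()
    try:
        yield pool
    finally:
        pool.release()


# Scoped use: the connection is returned after every block.
pool = ToyPool(size=2)
for _ in range(5):
    with session_scope(pool):
        pass
scoped_in_use = pool.checked_out

# Unscoped use: nothing calls release(), so the third acquire fails.
leaked = ToyPool(size=2)
leaked.acquire()
leaked.acquire()
try:
    leaked.acquire()
    exhausted = False
except RuntimeError:
    exhausted = True
```

A transaction context manager plays the role of session_scope here: the release happens in one place instead of relying on every caller to remember it.<br>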
<br>
Not sure if there's any setting in Heat that will fix this problem. <br>
Disabling connection pooling will likely not help since connections are <br>
not properly being closed and returned to the connection pool to begin with.<br>
<br>
Best,<br>
-jay<br>
<br>
[1] Heat apparently doesn't use the oslo.db enginefacade transaction <br>
context managers either, which would help with this problem since the <br>
transaction context manager would take responsibility for calling <br>
session.flush()/close() appropriately.<br>
<br>
<a href="https://github.com/openstack/oslo.db/blob/43af1cf08372006aa46d836ec45482dd4b5b5349/oslo_db/sqlalchemy/enginefacade.py#L626" rel="noreferrer" target="_blank">https://github.com/openstack/oslo.db/blob/43af1cf08372006aa46d836ec45482dd4b5b5349/oslo_db/sqlalchemy/enginefacade.py#L626</a><br>
<br>
</blockquote></div></div>