[magnum][api] Error system library fopen too many open files with magnum-auto-healer

Erik Olof Gunnar Andersson eandersson at blizzard.com
Tue Jan 5 09:22:57 UTC 2021


Yeah, I tested locally as well and wasn't able to reproduce it either. I changed the health service job to run every second, and it maxed out at about 42 connections to RabbitMQ with two conductor workers.

/etc/magnum/magnum.conf

[conductor]
workers = 2


________________________________
From: Spyros Trigazis <strigazi at gmail.com>
Sent: Tuesday, January 5, 2021 12:59 AM
To: Ionut Biru <ionut at fleio.com>
Cc: Erik Olof Gunnar Andersson <eandersson at blizzard.com>; feilong <feilong at catalyst.net.nz>; openstack-discuss <openstack-discuss at lists.openstack.org>
Subject: Re: [magnum][api] Error system library fopen too many open files with magnum-auto-healer



On Tue, Jan 5, 2021 at 9:36 AM Ionut Biru <ionut at fleio.com> wrote:
Hi,

I tried with processes=1 and it reached 1016 connections to RabbitMQ.
lsof output: https://paste.xinu.at/jGg/

I think it errors out when it reaches 1024 file descriptors.

I'm out of ideas on how to resolve this. I only have 3 clusters, so this is kind of weird, and it doesn't scale.

No issues here with 100s of clusters. Not sure what doesn't scale.

* Maybe your rabbit is flooded with notifications that are not consumed?
* You can use way more than 1024 file descriptors, maybe 2^10?
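
To see how close a process is to the 1024 descriptor limit mentioned above, something like the following might help. It is only a sketch: `fd_headroom` and `raise_soft_limit` are hypothetical helper names, it inspects the process it runs in (not the uwsgi workers), and the descriptor count relies on /proc/self/fd or /dev/fd being available.

```python
import os
import resource


def fd_headroom():
    """Return (open_fds, soft_limit, hard_limit) for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # One directory entry per open descriptor. The count includes the
    # descriptor that listdir itself opens, so it can be off by one.
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    open_fds = len(os.listdir(fd_dir))
    return open_fds, soft, hard


def raise_soft_limit(target):
    """Raise the soft RLIMIT_NOFILE toward target, capped at the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if target > soft:
        # An unprivileged process may raise its soft limit up to the hard limit.
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    used, soft, hard = fd_headroom()
    print(f"open descriptors: {used}, soft limit: {soft}, hard limit: {hard}")
```

For a service started by systemd, the limit is normally set with LimitNOFILE= in the unit file rather than from inside the process. Raising it only buys headroom; if the connection count keeps growing, the limit will be hit again later.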

Spyros

On Mon, Jan 4, 2021 at 9:53 PM Erik Olof Gunnar Andersson <eandersson at blizzard.com> wrote:

Sure looks like RabbitMQ. How many workers do you have configured?



Could you try changing the uwsgi configuration to workers=1 (or processes=1) and then see if it goes beyond 30 connections to AMQP?
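
One rough way to get that connection count is to tally established TCP connections to the AMQP port from lsof output. The sketch below assumes the default `lsof -nP` column layout (COMMAND, PID, ..., NAME) and the default RabbitMQ AMQP port 5672; `count_amqp_connections` is a hypothetical helper, not part of Magnum or oslo.messaging.

```python
import re
from collections import Counter

# Matches the peer side of an lsof TCP row, e.g.
# "10.0.0.5:43210->10.0.0.9:5672 (ESTABLISHED)".
_TCP_PEER = re.compile(r"->(?P<host>\S+):(?P<port>\d+)\s+\(ESTABLISHED\)")


def count_amqp_connections(lsof_text, amqp_port=5672):
    """Count established TCP connections to amqp_port, grouped by PID.

    lsof_text is the text output of something like `lsof -nP -p <pid>`;
    -n and -P keep hosts and ports numeric so the port can be compared.
    """
    counts = Counter()
    for line in lsof_text.splitlines():
        match = _TCP_PEER.search(line)
        if not match or int(match.group("port")) != amqp_port:
            continue
        fields = line.split()
        # In lsof's default layout the second column is the PID.
        if len(fields) > 1 and fields[1].isdigit():
            counts[int(fields[1])] += 1
    return counts
```

Running it against the output for each uwsgi worker a few minutes apart should show whether the count levels off or keeps climbing.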



From: Ionut Biru <ionut at fleio.com>
Sent: Monday, January 4, 2021 4:07 AM
To: Erik Olof Gunnar Andersson <eandersson at blizzard.com>
Cc: feilong <feilong at catalyst.net.nz>; openstack-discuss <openstack-discuss at lists.openstack.org>
Subject: Re: [magnum][api] Error system library fopen too many open files with magnum-auto-healer



Hi Erik,



Here is the lsof output of one uwsgi API process: https://paste.xinu.at/5YUWf/



I have kubernetes 12.0.1 (the Python client) installed in the env.





On Sun, Jan 3, 2021 at 3:06 AM Erik Olof Gunnar Andersson <eandersson at blizzard.com> wrote:

Maybe something similar to this?
https://github.com/kubernetes-client/python/issues/1158

What does lsof say?






--
Ionut Biru - https://fleio.com