[magnum][api] Error system library fopen too many open files with magnum-auto-healer

Ionut Biru ionut at fleio.com
Tue Jan 5 16:36:28 UTC 2021


Hi,

I found this story: https://storyboard.openstack.org/#!/story/2008308
regarding disabling cluster update notifications in rabbitmq.

I think this will help me.

On Tue, Jan 5, 2021 at 12:21 PM Erik Olof Gunnar Andersson <
eandersson at blizzard.com> wrote:

> Sorry, being repetitive here, but maybe try adding this to your magnum
> config as well. If you have A LOT of cores it could add up to a crazy
> amount of connections.
>
> [conductor]
> workers = 2
>
> ------------------------------
> *From:* Ionut Biru <ionut at fleio.com>
> *Sent:* Tuesday, January 5, 2021 1:50 AM
> *To:* Erik Olof Gunnar Andersson <eandersson at blizzard.com>
> *Cc:* Spyros Trigazis <strigazi at gmail.com>; feilong <
> feilong at catalyst.net.nz>; openstack-discuss <
> openstack-discuss at lists.openstack.org>
> *Subject:* Re: [magnum][api] Error system library fopen too many open
> files with magnum-auto-healer
>
> Hi,
>
> Here is my config. Maybe something is fishy.
>
> I did have around 300 messages in the queue in notification.info
> and notification.err and I purged them.
>
> https://paste.xinu.at/woMt/
>
>
>
> On Tue, Jan 5, 2021 at 11:23 AM Erik Olof Gunnar Andersson <
> eandersson at blizzard.com> wrote:
>
> Yea - tested locally as well and wasn't able to reproduce it either. I
> changed the health service job to run every second and maxed out at about
> 42 connections to RabbitMQ with two conductor workers.
>
> /etc/magnum/magnum.conf
>
> [conductor]
> workers = 2
>
>
> ------------------------------
> *From:* Spyros Trigazis <strigazi at gmail.com>
> *Sent:* Tuesday, January 5, 2021 12:59 AM
> *To:* Ionut Biru <ionut at fleio.com>
> *Cc:* Erik Olof Gunnar Andersson <eandersson at blizzard.com>; feilong <
> feilong at catalyst.net.nz>; openstack-discuss <
> openstack-discuss at lists.openstack.org>
> *Subject:* Re: [magnum][api] Error system library fopen too many open
> files with magnum-auto-healer
>
>
>
> On Tue, Jan 5, 2021 at 9:36 AM Ionut Biru <ionut at fleio.com> wrote:
>
> Hi,
>
> I tried with processes=1 and it reached 1016 connections to RabbitMQ.
> lsof output:
> https://paste.xinu.at/jGg/
>
> I think it goes into error when it reaches 1024 file descriptors.
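
To watch how close a process is getting to that limit without reading a full lsof dump, the descriptors can be counted directly. The following is a minimal sketch, not part of the original thread and not Magnum code: it assumes a Linux host where /proc/<pid>/fd is readable, and the helper name count_fds and its proc_root parameter are made up for illustration.

```python
import os
import sys


def count_fds(pid="self", proc_root="/proc"):
    """Return (total_fds, socket_fds) for a process.

    Reads the symlinks under <proc_root>/<pid>/fd (Linux only).
    Sockets show up as links whose target looks like "socket:[12345]".
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    total = 0
    sockets = 0
    for name in os.listdir(fd_dir):
        total += 1
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            sockets += 1
    return total, sockets


if __name__ == "__main__":
    # Usage: python count_fds.py <pid of the uwsgi worker>
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    total, sockets = count_fds(pid)
    print(f"pid={pid} open fds={total} sockets={sockets}")
```

Running it periodically against the uwsgi worker's pid should show whether the socket count keeps climbing toward the limit or levels off. Reading another user's process usually requires root.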
>
> I'm out of ideas on how to resolve this. I only have 3 clusters available,
> so this is kind of weird, and it doesn't scale.
>
>
> No issues here with 100s of clusters. Not sure what doesn't scale.
>
> * Maybe your rabbit is flooded with notifications that are not consumed?
> * You can use way more than 1024 file descriptors, maybe 2^10?
>
> Spyros
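
For reference, the per-process descriptor limit discussed above can be inspected, and the soft limit raised up to the hard limit, from Python's standard resource module. This is a hedged sketch and not something posted in the thread: the helper names nofile_limits and raise_soft_nofile are hypothetical, and the change only applies to the calling process and its children. A service such as uwsgi would normally get its limit from whatever starts it (for example systemd's LimitNOFILE= directive).

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_nofile(target):
    """Raise the soft open-file limit toward `target`.

    The value is capped at the hard limit, and the soft limit is never
    lowered. Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("before:", nofile_limits())
    print("after: ", raise_soft_nofile(65536))
```

A higher limit only buys headroom. If the connection count keeps growing, as in the report above, the process will eventually hit the new ceiling too.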
>
>
> On Mon, Jan 4, 2021 at 9:53 PM Erik Olof Gunnar Andersson <
> eandersson at blizzard.com> wrote:
>
> Sure looks like RabbitMQ. How many workers do you have configured?
>
>
>
> Could you try changing the uwsgi configuration to workers=1 (or
> processes=1) and then see if it goes beyond 30 connections to amqp?
>
>
>
> *From:* Ionut Biru <ionut at fleio.com>
> *Sent:* Monday, January 4, 2021 4:07 AM
> *To:* Erik Olof Gunnar Andersson <eandersson at blizzard.com>
> *Cc:* feilong <feilong at catalyst.net.nz>; openstack-discuss <
> openstack-discuss at lists.openstack.org>
> *Subject:* Re: [magnum][api] Error system library fopen too many open
> files with magnum-auto-healer
>
>
>
> Hi Erik,
>
>
>
> Here is lsof of one uwsgi api. https://paste.xinu.at/5YUWf/
>
>
>
> I have kubernetes 12.0.1 installed in the env.
>
>
>
>
>
> On Sun, Jan 3, 2021 at 3:06 AM Erik Olof Gunnar Andersson <
> eandersson at blizzard.com> wrote:
>
> Maybe something similar to this?
> https://github.com/kubernetes-client/python/issues/1158
>
> What does lsof say?
>
>
>
>
>
>
>
> --
> Ionut Biru - https://fleio.com
>
>
>
> --
> Ionut Biru - https://fleio.com
>


-- 
Ionut Biru - https://fleio.com