On a test cluster I tried to kill one of the sessions in "closing" status; pacemaker noticed it immediately and spawned a new process. So neither the "closing" status nor the session timestamps are a reliable indication of unused sessions. I'm still not sure whether enabling linger or simply raising the systemd-logind "SessionsMax" limit is the better solution; rough sketches of the session check and of both options follow below the quoted message.

Quoting Eugen Block <eblock@nde.ag>:
Hi *,
I can't seem to find much about rabbitmq and logind, so I wanted to ask the list if anyone has encountered the same issue and, if so, how they dealt with it. We're supporting a Victoria cluster (installed with our own deployment method) mostly controlled by pacemaker. On the rabbit master node I constantly see this warning:
---snip---
2024-07-29T14:09:23.552576+02:00 control01 su: pam_unix(su:session): session opened for user rabbitmq by (uid=0)
2024-07-29T14:09:24.450657+02:00 control01 su: pam_unix(su:session): session closed for user rabbitmq
2024-07-29T14:09:24.500356+02:00 control01 su: (to rabbitmq) root on none
2024-07-29T14:09:24.502370+02:00 control01 su: pam_systemd(su:session): Failed to create session: Maximum number of sessions (8192) reached, refusing further sessions.
2024-07-29T14:09:24.502681+02:00 control01 su: pam_unix(su:session): session opened for user rabbitmq by (uid=0)
2024-07-29T14:09:25.565203+02:00 control01 su: pam_unix(su:session): session closed for user rabbitmq
2024-07-29T14:09:25.609613+02:00 control01 su: (to rabbitmq) root on none
---snip---
Looking at loginctl list-sessions, almost all of the sessions belong to rabbitmq and have very old timestamps (from 2023). I'm aware that older systemd versions can't clean up closing sessions correctly [0], but we can't upgrade at this time. Would enabling "linger" for the user (loginctl enable-linger rabbitmq) fix this after a reboot? During the next maintenance window I would reboot control01 and watch how the other control nodes behave wrt rabbit login sessions. But I'm wondering if somebody has already dealt with this?
Thanks for any pointers! Eugen
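
For reference, this is roughly how I check how many sessions are piling up for rabbitmq and what state they are in. Treat the awk field index as an assumption: USER is normally the third column of "loginctl list-sessions", but the column layout can differ between systemd versions.

---snip---
# count the logind sessions currently owned by rabbitmq
loginctl list-sessions --no-legend | awk '$3 == "rabbitmq"' | wc -l

# show the state of each rabbitmq session; on the affected systemd
# versions many of them stay in "closing" instead of being reaped
for s in $(loginctl list-sessions --no-legend | awk '$3 == "rabbitmq" {print $1}'); do
    loginctl show-session "$s" -p Id -p State -p Timestamp
done
---snip---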
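And this is a rough sketch of the two options I'm weighing above; the drop-in file name and the SessionsMax value are just illustrative examples, not a recommendation, and whether lingering actually prevents the pile-up is exactly the open question.

---snip---
# Option 1: raise the logind session limit via a drop-in
# (logind reads its configuration at startup)
cat > /etc/systemd/logind.conf.d/90-sessionsmax.conf <<'EOF'
[Login]
SessionsMax=16384
EOF
systemctl restart systemd-logind

# Option 2: mark the rabbitmq user as lingering, so a user manager
# is kept around for it independently of the individual su sessions
loginctl enable-linger rabbitmq
---snip---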