Hello Uday,

Thank you very much for your reply and inputs; I will look into it.

I may have given too little information about our setup: since we went with Red Hat for this RHOSP implementation, everything was deployed with TripleO and is containerised with podman. Our control nodes run Pacemaker to bundle Galera, RabbitMQ and HAProxy.

If anyone would like to share what they are using and how, with a similar setup or not, feel free to do so.

Best regards,

Timothé BAUGÉ
timothe.bauge@covage.com
COVAGE I Wholesale B2B du Groupe Altitude

________________________________
From: Uday Dikshit <udaydikshit2007@gmail.com>
Sent: Tuesday, 10 December 2024 20:03
To: Timothé Bauge <Timothe.Bauge@covage.com>
Cc: openstack-discuss@lists.openstack.org <openstack-discuss@lists.openstack.org>
Subject: Re: OpenStack metrics and logs

Hello Timothé,

Since you are already using Grafana, you can add OpenStack's MariaDB database connection as a data source, which will help you collect basic information on the capacity of your cloud ecosystem.

Another add-on you might like is benchmarking your OpenStack services as part of regular health checks. The OpenStack Rally project can easily be used for this; it will help you capture service uptime and response time.

I hope this information helps with your use case.

On Tue, Dec 10, 2024, 19:03 Timothé Bauge <Timothe.Bauge@covage.com> wrote:

Hello Stackers,

I would like to know what you are all using to monitor and supervise your OpenStack clusters.

We are in the process of setting up our own private cloud based on Red Hat OpenStack Platform 17.1. We chose not to go with the Red Hat Service Telemetry Platform, and were strongly advised against Ceilometer and Aodh.
At the moment, we have built our own stack based on Prometheus (with plenty of exporters) for metrics, Graylog + OpenSearch for logs, and Grafana for visualisation.

For now, we are only looking to retrieve basic information so that we know when a malfunction occurs, but the end goal might be to go as far as counting CPU/memory/disk usage per hour, per VM, per project, etc.

So, what are you all using and how did you implement it?

Best regards,

Timothé BAUGÉ
timothe.bauge@covage.com
COVAGE I Wholesale B2B du Groupe Altitude
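
As a rough illustration of the per-hour, per-VM, per-project accounting described as the end goal in the original message, here is a minimal Python sketch. It assumes usage samples have already been collected somewhere (for example, scraped by the Prometheus exporters) and loaded as dictionaries. The field names (project, vm, ts, cpu, mem_mb, disk_gb) and the hourly_usage function are hypothetical and are not part of any OpenStack or Prometheus API.

```python
from collections import defaultdict
from datetime import datetime, timezone

METRICS = ("cpu", "mem_mb", "disk_gb")  # hypothetical metric names


def hourly_usage(samples):
    """Average each metric per (project, vm, hour) bucket.

    `samples` is an iterable of dicts with the hypothetical keys
    project, vm, ts (Unix seconds, UTC), cpu, mem_mb and disk_gb.
    """
    sums = defaultdict(lambda: {"cpu": 0.0, "mem_mb": 0.0, "disk_gb": 0.0, "n": 0})
    for s in samples:
        # Truncate the timestamp to the start of its UTC hour.
        hour = datetime.fromtimestamp(s["ts"], tz=timezone.utc).strftime("%Y-%m-%dT%H:00Z")
        bucket = sums[(s["project"], s["vm"], hour)]
        for metric in METRICS:
            bucket[metric] += s[metric]
        bucket["n"] += 1
    # Turn the running sums into per-bucket averages.
    return {
        key: {metric: acc[metric] / acc["n"] for metric in METRICS}
        for key, acc in sums.items()
    }
```

Summing the per-VM buckets that share a project and hour would then give a per-project figure. Whether an average, a maximum, or an integral over the hour is the right aggregate depends on how the usage is meant to be reported.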