OpenStack Ansible - Telemetry
Dear All,
I would like to know which monitoring solution is currently supported by OSA. We are operating a small cloud (~6 nodes) and we are interested in collecting performance metrics, events and logs.
So, as far as I know, the official OSA solution is ceilometer/aodh/panko with Gnocchi as the DB backend. However, the Gnocchi project seems abandoned at the moment and the Grafana plugin is not compatible with the latest Grafana.
Then there is a solution based on collectd with this plugin (https://github.com/signalfx/collectd-openstack) with Graphite or InfluxDB as the backend. This supports only performance metrics, not events.
Then there are also some Prometheus exporters available; again, metrics only.
What do you guys use these days? What would you recommend?
Thanks!
Best regards,
Martin
On 07/01/2021 13:20, Golasowski Martin wrote:
[...]
Hi there,
with my telemetry hat on: we're working on the gnocchi issue, but gnocchi is only a metrics store anyway. Personally, I wouldn't want to store any events in panko. If you use things like autoscaling for instances, you definitely want gnocchi and aodh.
With my collectd hat on: collectd supports collecting and sending metrics and events to multiple write endpoints. It is not designed to collect additional metadata, such as project or user data. You'll mostly get infrastructure-related data (from bare-metal nodes). The downside of using Graphite or InfluxDB in their open source variants is that you don't get HA.
There is the Service Telemetry Framework [1], but it is integrated with TripleO, not with OSA. It uses both collectd and ceilometer for collection; metrics are stored in Prometheus and events in Elasticsearch, which is also used for log aggregation. I am not sure whether this solution is a bit too heavy for your use case.
What would serve this community best: putting some manpower into gnocchi.
Matthias
[1] https://infrawatch.github.io/documentation/
Thanks!
So, in that case, the “builtin” ceilometer with gnocchi is the way to go. In fact, it works when deployed with OSA; the only problem is the incompatible Grafana plugin. Would you recommend some other tool to visualise gnocchi metrics?
We can always downgrade Grafana to the last version that was compatible with the plugin, but that may break other telemetry.
Regards,
Martin
On 7. 1. 2021, at 19:57, Matthias Runge <mrunge@matthias-runge.de> wrote:
[...]