[telemetry] wallaby cycle planning session
Hi there,
one of the biggest challenges for the telemetry stack is currently the state of gnocchi, which is... undefined/unfortunate/under-contributed/...?
Telemetry started a long time ago as a simple component, ceilometer, which was split into several components: ceilometer, aodh, panko, and gnocchi. Julien wrote a story about this some time ago[1].
There has also been an attempt to fork gnocchi back to OpenStack[2].
To my knowledge, the original contributors are not paid anymore to work on gnocchi, and at the same time, moving on to do something else is totally fine.
However, I am not sure if we (in OpenStack Telemetry) should or could maintain a time-series database in addition to the rest of the telemetry stack. I'd like to discuss this during a call.
Please select time(s) that suit you best in this poll[3].
If you have questions or hints, don't hesitate to contact me.
Thank you,
Matthias
[1] https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/
[2] https://review.opendev.org/#/c/744592/
[3] https://doodle.com/poll/uqq328x5shr43awy
I answered your doodle poll there. I think that we can work together to find a solution for this unfortunate situation. Keep me updated with the results of the poll.
-- Rafael Weingärtner
Good morning, quick reminder, if you are interested in telemetry and its future or fate, please participate in the doodle[3]. Thank you.
On 17/11/2020 08:22, Matthias Runge wrote:
Good morning,
quick reminder, if you are interested in telemetry and its future or fate, please participate in the doodle[3].
Thank you.
Hi,
as a result, we will be meeting next Friday (Nov. 27 at 4 PM CET, that is 10 AM US Eastern). If it works for you, let's use this bluejeans channel[4] for the discussion.
For ideas, questions and an agenda, there is the etherpad[5].
See you there
[4] https://redhat.bluejeans.com/u/mrunge/
[5] https://etherpad.opendev.org/p/telemetry-wallaby-topics
On 11/12/20 11:25 AM, Matthias Runge wrote:
Hi there,
one of the biggest challenges for the telemetry stack is currently the state of gnocchi, which is... undefined/unfortunate/under-contributed/...?
Telemetry started a long time ago as a simple component, ceilometer, which was split into several components: ceilometer, aodh, panko, and gnocchi. Julien wrote a story about this some time ago[1].
There has also been an attempt to fork gnocchi back to OpenStack[2].
To my knowledge, the original contributors are not paid anymore to work on gnocchi, and at the same time, moving on to do something else is totally fine.
However, I am not sure if we (in OpenStack Telemetry) should or could maintain a time-series database in addition to the rest of the telemetry stack. I'd like to discuss this during a call.
Matthias,
I'm not sure I will have time to join the call. So hopefully, you understand that I prefer email (also because I won't be able to contribute to the project, so maybe joining the call would be overkill).
Could you list the alternatives? If not using Gnocchi, what other backend would you use? As an operator, I need a time-series database which is:
- free software
- HA
- able to scale
- packaged or packageable in my distro
I don't know any time series db (apart from Gnocchi) that checks all the bullets above. Do you?
On 11/20/20 3:48 PM, Matthias Runge wrote:
If it works for you, let's use this bluejeans channel[4] for the discussion.
Gosh, reading that you'd be using bluejeans, then definitively, I will not be able to join the call... :/
Cheers,
Thomas Goirand (zigo)
On 2020-11-21 02:38:54 +0100 (+0100), Thomas Goirand wrote: [...]
Gosh, reading that you'd be using bluejeans, then definitively, I will not be able to join the call... :/
If it helps, I've been able to join Bluejeans calls with just Firefox (the version in debian/unstable at least), no proprietary browser extension or separate client needed. It may not be ideal, but it's worked for me in the past anyway. -- Jeremy Stanley
On 21/11/2020 15:26, Jeremy Stanley wrote:
On 2020-11-21 02:38:54 +0100 (+0100), Thomas Goirand wrote: [...]
Gosh, reading that you'd be using bluejeans, then definitively, I will not be able to join the call... :/
If it helps, I've been able to join Bluejeans calls with just Firefox (the version in debian/unstable at least), no proprietary browser extension or separate client needed. It may not be ideal, but it's worked for me in the past anyway.
Thanks again for raising this. Tbh, I was expecting to hear exactly that concern from you, but I couldn't find a "better" solution for now.
While I know bluejeans itself is not open source, it is a hosted solution, and it doesn't require you to install any additional software or an "app". It worked with old Firefox versions and it works with newer ones as well. You can also dial in with local numbers if you don't want to or cannot use a computer.
A real alternative may be BigBlueButton or senfcall.de; I haven't tested those in world-wide communications so far. My personal experience with jitsi calls from the same area was that it scaled up to a few participants. Currently, I don't know how many people will join the call.
On a side note, is anyone aware of a service like "doodle", but without privacy issues? (I was expecting protest about using doodle as well.)
Matthias
On 2020-11-23 09:07:51 +0100 (+0100), Matthias Runge wrote: [...]
On a side note, is anyone aware of a service like "doodle", but without privacy issues? (I was expecting protest about using doodle as well).
If you're looking for a general survey tool, Limesurvey is entirely free/libre open source software; we've got a proof of concept for it running, with some minimal survey admin docs here if you want to try setting up a survey:
https://docs.opendev.org/opendev/system-config/latest/survey.html#admin-surv...
Depending on what you used Doodle for, though, something as simple as https://framadate.org/ (also open source, but you can choose to use their free hosted version) can work rather well.
-- Jeremy Stanley
On 23 Nov 2020, at 15:34, Jeremy Stanley <fungi@yuggoth.org> wrote:
On 2020-11-23 09:07:51 +0100 (+0100), Matthias Runge wrote: [...]
On a side note, is anyone aware of a service like "doodle", but without privacy issues? (I was expecting protest about using doodle as well).
If you're looking for a general survey tool, Limesurvey is entirely free/libre open source software, we've got a proof of concept for it running with some minimal survey admin docs here if you want to try setting up a survey:
https://docs.opendev.org/opendev/system-config/latest/survey.html#admin-surv...
Depending on what you used Doodle for though, something as simple as https://framadate.org/ (also open source but you can choose to use their free hosted version) can work rather well.
CERN also has a tool called newdle (https://github.com/indico/newdle) if you want to run something on-premise. I understand it can be run standalone or as part of the Indico conferencing system (used at CERN, the UN and many other places - https://github.com/indico).
Tim
On 23/11/2020 16:14, Tim Bell wrote:
On 23 Nov 2020, at 15:34, Jeremy Stanley <fungi@yuggoth.org> wrote:
On 2020-11-23 09:07:51 +0100 (+0100), Matthias Runge wrote: [...]
On a side note, is anyone aware of a service like "doodle", but without privacy issues? (I was expecting protest about using doodle as well).
If you're looking for a general survey tool, Limesurvey is entirely free/libre open source software, we've got a proof of concept for it running with some minimal survey admin docs here if you want to try setting up a survey:
https://docs.opendev.org/opendev/system-config/latest/survey.html#admin-survey-user
Depending on what you used Doodle for though, something as simple as https://framadate.org/ (also open source but you can choose to use their free hosted version) can work rather well.
CERN has also a tool called newdle (https://github.com/indico/newdle) if you want to run something on-premise. I understand it can be run standalone or as part of the Indico conferencing system (used at CERN, UN and many other places - https://github.com/indico)
Thank you both, this is good to know, and I'll definitely keep them in mind next time.
The tool I found (a student's re-implementation of the scheduling version of doodle) had issues with multiple time zones.
Matthias
Hello guys, I got a bit lost throughout the thread. Do we already have a meeting place (Zoom, meet, or some other method/place)?
-- Rafael Weingärtner
On 23/11/2020 17:48, Rafael Weingärtner wrote:
Hello guys, I got a bit lost throughout the thread. Do we already have a meeting place (Zoom, meet, or some other method/place)?
Yes, we do; see http://lists.openstack.org/pipermail/openstack-discuss/2020-November/018937....:
---------------
Hi,
as a result, we will be meeting next Friday (Nov. 27 at 4 PM CET, that is 10 AM US Eastern). If it works for you, let's use this bluejeans channel[4] for the discussion.
For ideas, questions and an agenda, there is the etherpad[5].
See you there
[4] https://redhat.bluejeans.com/u/mrunge/
[5] https://etherpad.opendev.org/p/telemetry-wallaby-topics
---------------
On 21/11/2020 02:38, Thomas Goirand wrote:
Matthias,
I'm not sure I will have time to join the call. So hopefully, you understand that I prefer email (also because I won't be able to contribute to the project, so maybe joining the call would be overkill).
Could you list the alternatives? If not using Gnocchi, what other backend would you use? As an operator, I need a time-series database which is:
- free software
- HA
- able to scale
- packaged or packageable in my distro
Thomas,
thank you for your email. Let me go into more detail: since this requires some discussion, I am proposing to have a call instead of an email thread.
The current stack uses gnocchi; you may be well aware of the back and forth around it. Also, I am not sure how much effort we can invest, or if that's feasible.
I totally agree, we'd want free software (not some open core or so) and also a scalable solution (you may have heard of some load issues with old ceilometer or gnocchi before).
Another solution may be to split use cases and to use gnocchi with an in-memory store for short-lived telemetry data and a different backend for long-term storage, for example for billing data etc.
If we stop providing the gnocchi API (or don't bring the old ceilometer API back), we'll also cut off applications currently using that API.
All that requires some coordination. Does that explain my preference for a call?
I don't know any time series db (apart from Gnocchi) that checks all the bullets above. Do you?
Personally, I wouldn't check gnocchi on the scalable bullet. (I'm eager to hear your success stories; that would make things much easier.)
Or let me rephrase this: it seems to be easier to achieve ingesting many more metrics per time interval, e.g. by using prometheus, and at the same time using fewer hardware resources.
Matthias
On 11/23/20 8:54 AM, Matthias Runge wrote:
Personally, I wouldn't check gnocchi on the scalable bullet. (I'm eager to hear your success stories, that would make things much easier).
Maybe you should read this: https://julien.danjou.info/gnocchi-4-performance/
Julien Danjou claims Gnocchi is able to eat 100k records per second. I wouldn't bet on this, but that's probably enough for OpenStack.
Though the biggest bottleneck is probably the notification bus, not the time series. I've heard that switching to kafka may help, but I'm really not a fan of this "solution" (which IMO isn't one, as Kafka brings its own set of problems).
Or let me rephrase this: it seems to be easier to achieve ingesting many more metrics per time interval, e.g. by using prometheus, and at the same time using fewer hardware resources.
Prometheus is already in Debian, and it's been there for the last 2 releases, so I'd be ok switching to it. However, you'd have to provide a migration path for those already using Gnocchi. Cheers, Thomas Goirand (zigo)
On 23/11/2020 09:30, Thomas Goirand wrote:
On 11/23/20 8:54 AM, Matthias Runge wrote:
Personally, I wouldn't check gnocchi on the scalable bullet. (I'm eager to hear your success stories, that would make things much easier).
Maybe you should read this: https://julien.danjou.info/gnocchi-4-performance/
Julien Danjou claims Gnocchi is able to eat 100k records per second. I wouldn't bet on this, but that's probably enough for OpenStack.
Though the biggest bottleneck is probably the notification bus, not the time series. I've heard that switching to kafka may help, but I'm really not a fan of this "solution" (which IMO isn't one, as Kafka brings its own set of problems).
All of this should be discussed in the community and also decided on. If we pick gnocchi, then we should also buy in and help out. I know that friendly and fellow OpenStackers already contribute there, which doesn't mean that there couldn't or shouldn't be more helping hands.
Or let me rephrase this: it seems to be easier to achieve ingesting many more metrics per time interval, e.g. by using prometheus, and at the same time using fewer hardware resources.
Prometheus is already in Debian, and it's been there for the last 2 releases, so I'd be ok switching to it. However, you'd have to provide a migration path for those already using Gnocchi.
If there has to be a migration path, that's also up for discussion. We are not a service provider here, we are simply providing the software. And as for all software, a new release may also bring backwards-incompatible changes...
Matthias
On Thu, Nov 12, 2020 at 11:25:09AM +0100, Matthias Runge wrote:
Hi there,
one of the biggest challenges for the telemetry stack is currently the state of gnocchi, which is... undefined/unfortunate/under-contributed/...?
Hi there,
we had our planning/brainstorming session, where we talked about the current state and the situation with gnocchi specifically.
The situation can probably be described as follows: gnocchi fills a gap that we couldn't cover 100% if we had to replace it. It is also seen as well-performing and functional.
We also talked about the discussion around moving gnocchi back under openinfra and what this would solve. It is well understood that the initial gnocchi project contributors are not paid for any work they do on gnocchi.
We agreed to address the concern "we cannot merge anything in gnocchi" by contributing to gnocchi for the next cycle and to revisit the situation next cycle. Julien, Mehdi and also Gord mentioned (iirc) that the bar for getting merge permission for a project with only a few contributors is quite low.
A few PRs against gnocchi were mentioned, which were contributed but not reviewed/merged yet:
* https://github.com/gnocchixyz/gnocchi/pull/1059
* https://github.com/gnocchixyz/gnocchi/pull/1062
* https://github.com/gnocchixyz/gnocchi/pull/1056 (this one was merged)
* https://github.com/gnocchixyz/python-gnocchiclient/pull/104
We'll take a look and see how to proceed here. Gnocchi's CI system apparently needs some work.
On a related note, we are actively seeking to increase the collaboration with related projects or SIGs, for example with CloudKitty.
[1] https://etherpad.opendev.org/p/telemetry-wallaby-topics
Matthias
-- Matthias Runge <mrunge@matthias-runge.de>
participants (5)
- Jeremy Stanley
- Matthias Runge
- Rafael Weingärtner
- Thomas Goirand
- Tim Bell