[openstack-dev] [kolla] Monitoring tooling

Michał Jastrzębski inc007 at gmail.com
Sun Jul 24 15:05:54 UTC 2016


Guys, thanks for all that!

Can we for a second abstract this discussion from technology and start
by lining up the scenarios we want to achieve? Then we can pick the
software that will allow us to achieve all/most of those scenarios with
the least amount of work/maintenance.

So, my scenarios:

I want to see the health of the Docker service
I want to see when the message queue becomes saturated
I want to see when RAM usage exceeds 70%
I want to see when my network causes tons of retransmissions
I want to see when one of the nodes is down

Did I miss anything? Which software stack would allow me to see these things?
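
To make the RAM scenario above concrete, here is a minimal sketch of a
threshold check written in the Nagios-style plugin convention that Sensu
checks also follow (exit code 0 = OK, 2 = CRITICAL, 3 = UNKNOWN). The
function names, the Linux-only /proc/meminfo source and the choice of
MemAvailable as the "free" figure are assumptions made for illustration;
none of this comes from an existing Kolla patch.

```python
"""Hypothetical RAM threshold check in the Nagios/Sensu plugin style."""

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios-compatible exit codes


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only "Name: <integer> [unit]" lines; skip anything else.
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def ram_used_percent(meminfo):
    """Percentage of RAM in use, derived from MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def check_ram(used_percent, threshold=70.0):
    """Return (exit_code, message) for a single 'exceeds threshold' rule."""
    if used_percent > threshold:
        return CRITICAL, "CRITICAL: RAM at %.1f%% (threshold %.1f%%)" % (
            used_percent, threshold)
    return OK, "OK: RAM at %.1f%%" % used_percent


def main():
    """Read the local /proc/meminfo, print the status line, return the code."""
    try:
        with open("/proc/meminfo") as handle:
            meminfo = parse_meminfo(handle.read())
        code, message = check_ram(ram_used_percent(meminfo))
    except (OSError, KeyError, ZeroDivisionError) as exc:
        code, message = UNKNOWN, "UNKNOWN: %s" % exc
    print(message)
    return code
```

A script wrapping this would end with "raise SystemExit(main())" so that
the exit code reaches whatever runs the check. The other scenarios (queue
saturation, retransmissions, node down) would need different data sources,
so which stack collects them is still the open question.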

Cheers,
Michal

On 24 July 2016 at 09:09, Mathias Ewald <mewald at evoila.de> wrote:
> I think Sensu is the best monitoring approach out there atm. Nagios / Icinga
> are way too static and scale badly imho. The kind of checks you proposed are
> quite interesting. I would suggest wrapping a Sensu check around Tempest, but
> that's going too far for the first cycle.
>
> The two stacks (Sensu + Uchiwa and TICK) only really overlap in metrics
> collection, which can be done via either Sensu or Telegraf. I don't know if
> it makes sense to have both ... I definitely think we need Sensu, though,
> simply to monitor service availability and other thresholds and events that
> aren't covered by TICK (not everything is time series data) and to have the
> alerting. With Sensu only, we have no insight into performance and trends;
> with TICK only, we lack alerting on events and non-performance metric data
> (Is Keystone up? etc.)
>
> I think it won't hurt to develop these two stacks in parallel, and maybe
> we'll join them together in a chain as I described earlier.
>
> 2016-07-24 14:25 GMT+02:00 Dave Walker <email at daviey.com>:
>>
>> Thanks Mathias,
>>
>> I'm not tied to Sensu.. anything can really fill that gap in my mind.
>> You've done a good job at outlining the steps involved.  I created a
>> blueprint with the steps I had in mind[0]
>>
>> For this cycle, I wanted to keep it simple so it was easily achievable.  I
>> only planned to have some basic up/down for each node and throw the
>> performance data on the floor.
>>
>> I wanted to include the option to include local configs, as JSON blobs.
>> Some of the things I was thinking of as local config:
>>   - daily checkouts: can instances be built with networking?
>>   - remaining resources count (i.e., does each subnet have X remaining IP
>> addresses available?)
>>   - Is Ceph healthy?
>>
>> So, these things aren't really interesting as performance-over-time data,
>> which means the intention does differ. However, I do agree that both stacks
>> could achieve both objectives.
>>
>> I've essentially got much of this working locally, but it would require
>> about a day of cleaning up for submission... but if your work can achieve
>> the objectives above, I'm happy to discontinue... or help make your stack
>> pluggable.
>>
>> [0] https://blueprints.launchpad.net/kolla/+spec/sensu
>>
>> --
>> Kind Regards,
>> Dave Walker
>>
>> On 24 July 2016 at 11:56, Mathias Ewald <mewald at evoila.de> wrote:
>>>
>>> Monitoring is a difficult topic as the number of options regarding the
>>> toolset and mechanisms is very high. We had some chats about it on IRC that
>>> uncovered even more options than I thought existed :D I believe Dave's view
>>> on Sensu is generally correct in that Sensu is more directed at monitoring
>>> in the form of "is X running/working?". It does of course have the ability
>>> to transport metrics, too, but lacks good dashboarding capabilities for
>>> performance data. One setup I could imagine is
>>>
>>> 1. Sensu Client to collect checks and metrics
>>> 2. RabbitMQ for transport
>>> 3. Sensu Server to receive, evaluate, alarm and write metrics to InfluxDB
>>> 4. Uchiwa as a Dashboard to Sensu
>>> 5. InfluxDB to store metrics
>>> 6. Grafana to dashboard metrics
>>>
>>> So Sensu could be used as a replacement for (or in addition to) a metrics
>>> collection daemon like Collectd or what I decided to use: Telegraf. For my
>>> implementation, this means I will add a parameter to make Telegraf optional.
>>> This way, someone else may implement the rest of the stack and the user can
>>> decide which one to use.
>>>
>>> What do you think?
>>>
>>> Mathias
>>>
>>>
>>>
>>> 2016-07-23 21:51 GMT+02:00 Stephen Hindle <shindle at llnw.com>:
>>>>
>>>> My understanding was that Sensu could produce metrics?
>>>> And Kapacitor can do alerting for the TICK stack stuff mewald is
>>>> doing...
>>>> I really don't see them as that different?
>>>>
>>>>
>>>> On Fri, Jul 22, 2016 at 5:19 PM, Dave Walker <email at daviey.com> wrote:
>>>> > Yes, this is my thought.
>>>> >
>>>> > The scope of the Sensu work is: "Is this thing working?" (with the
>>>> > reference being up/down)
>>>> > But the scope of Grafana and friends is: "How hard is this working?"
>>>> > (but no alerting)
>>>> >
>>>> > They are certainly complementary... However, Sensu can throw data at a
>>>> > Grafana stack (aiui), but I fear that is too much to achieve this
>>>> > cycle.
>>>> >
>>>> > --
>>>> > Kind Regards,
>>>> > Dave Walker
>>>> >
>>>> > On 23 July 2016 at 00:11, Fox, Kevin M <Kevin.Fox at pnnl.gov> wrote:
>>>> >>
>>>> >> I think those are two different, complementary things.
>>>> >>
>>>> >> One's metrics and the other is monitoring. You probably want both at
>>>> >> the same time.
>>>> >>
>>>> >> Thanks,
>>>> >> Kevin
>>>> >> ________________________________________
>>>> >> From: Steven Dake (stdake) [stdake at cisco.com]
>>>> >> Sent: Friday, July 22, 2016 3:52 PM
>>>> >> To: OpenStack Development Mailing List (not for usage questions)
>>>> >> Subject: Re: [openstack-dev] [kolla] Monitoring tooling
>>>> >>
>>>> >> Thanks for pointing that out.  Brain out to lunch today it appears :(
>>>> >>
>>>> >> I think choices are a good thing even though they increase our
>>>> >> implementation footprint.  Anyone opposed to implementing both with
>>>> >> something in globals.yml like
>>>> >> monitoring: grafana or
>>>> >> monitoring: sensu
>>>> >>
>>>> >> Comments, questions, or concerns welcome.
>>>> >>
>>>> >> Regards
>>>> >> -steve
>>>> >>
>>>> >> On 7/22/16, 3:42 PM, "Stephen Hindle" <shindle at llnw.com> wrote:
>>>> >>
>>>> >> >Don't forget mewald's implementation as well - we now have 2
>>>> >> >monitoring options for kolla :-)
>>>> >> >
>>>> >> >On Fri, Jul 22, 2016 at 3:15 PM, Steven Dake (stdake)
>>>> >> ><stdake at cisco.com> wrote:
>>>> >> >> Hi folks,
>>>> >> >>
>>>> >> >> At the midcycle we decided to push off implementing monitoring until
>>>> >> >> post-Newton. The rationale for this decision was that the core review
>>>> >> >> team has enough on their plates and nobody was super keen to implement
>>>> >> >> any monitoring solution given our other priorities.
>>>> >> >>
>>>> >> >> Like all good things, communities produce new folks that want to do
>>>> >> >> new things, and Sensu was proposed as Kolla's monitoring solution (at
>>>> >> >> least the first one). A developer that has done some good work has
>>>> >> >> shown up to do the job as well :) I have heard good things about
>>>> >> >> Sensu, minus the fact that it is implemented in Ruby, and I fear it
>>>> >> >> may end up causing our gate a lot of hassle.
>>>> >> >>
>>>> >> >> https://review.openstack.org/#/c/341861/
>>>> >> >>
>>>> >> >>
>>>> >> >> Anyway I think we can work through the gate problem.
>>>> >> >>
>>>> >> >> Does anyone have any better suggestion? I'd like to unblock Dave's
>>>> >> >> work, which is blocked on a -2 pending a complete discussion of our
>>>> >> >> monitoring solution. Note we may end up implementing more than one
>>>> >> >> down the road; Sensu is just where the original interest was.
>>>> >> >>
>>>> >> >> Please provide feedback, even if you don't have a preference, whether
>>>> >> >> you're a core reviewer or not.
>>>> >> >>
>>>> >> >> My take is we can merge this work in non-priority order, and if it
>>>> >> >> makes the end of the cycle, fantastic; if not, we can release it in
>>>> >> >> Ocata.
>>>> >> >>
>>>> >> >> Regards
>>>> >> >> -steve
>>>> >> >>
>>>> >> >>
>>>> >> >> __________________________________________________________________________
>>>> >> >> OpenStack Development Mailing List (not for usage questions)
>>>> >> >> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>> >> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>> >> >>
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >--
>>>> >> >Stephen Hindle - Senior Systems Engineer
>>>> >> >480.807.8189 480.807.8189
>>>> >> >www.limelight.com Delivering Faster Better
>>>> >> >
>>>> >> >--
>>>> >> >The information in this message may be confidential. It is intended
>>>> >> >solely for the addressee(s). If you are not the intended recipient, any
>>>> >> >disclosure, copying or distribution of the message, or any action or
>>>> >> >omission taken by you in reliance on it, is prohibited and may be
>>>> >> >unlawful. Please immediately contact the sender if you have received
>>>> >> >this message in error.
>>>>
>>> --
>>> Mobil: +49 176 10567592
>>> E-Mail: mewald at evoila.de
>>>
>>> evoila GmbH
>>> Wilhelm-Theodor-Römheld-Str. 34
>>> 55130 Mainz
>>> Germany
>>>
>>> Managing Director: Johannes Hiemer
>>>
>>> Mainz District Court (Amtsgericht Mainz), HRB 42719
>>>
>>> This e-mail may contain confidential and/or privileged information. If
>>> You are not the intended recipient (or have received this e-mail in error)
>>> please notify the sender immediately and destroy this e-mail. Any
>>> unauthorised copying, disclosure or distribution of the material in this
>>> e-mail is strictly forbidden.
>>>
>