[openstack-dev] [new][cloudpulse] Announcing a project to HealthCheck OpenStack deployments

Georgy Okrokvertskhov gokrokvertskhov at mirantis.com
Wed May 13 02:18:08 UTC 2015


Murano itself does not provide any monitoring. The idea here is to expose
whatever capabilities an application has for doing this. In this demo we had
a Java application deployed on a Tomcat VM and connected to a PostgreSQL DB.
The Java app workflow executed Nagios application methods to register itself
in the Nagios monitoring system by adding the proper IP, port, and URL
information for standard Nagios HTTP probes. Nagios itself can send
notifications to other services such as e-mail, IM, or custom ones (Murano,
Heat, etc. via simple bash/curl scripts).
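
To make the registration step a bit more concrete, here is a rough sketch of
the kind of Nagios object definitions such a method could write out for a
standard HTTP probe. This is not the code from the Murano Nagios package. The
function name, the host name, the address, and the generic-host /
generic-service templates are all assumptions for illustration, and the
check_http!... form assumes a check_http command that passes $ARG1$ through
to the plugin, as in the Nagios sample configuration.

```python
def render_http_probe(host_name, address, port, url_path):
    """Render Nagios host and service definitions for one HTTP probe.

    Hypothetical helper: the templates and names are assumptions.
    """
    # -p (port) and -u (URL path) are options of the check_http plugin.
    check_command = "check_http!-p {} -u {}".format(port, url_path)
    lines = [
        "define host {",
        "    use                  generic-host",
        "    host_name            {}".format(host_name),
        "    address              {}".format(address),
        "}",
        "",
        "define service {",
        "    use                  generic-service",
        "    host_name            {}".format(host_name),
        "    service_description  HTTP {}".format(url_path),
        "    check_command        {}".format(check_command),
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only; a real workflow would pass the deployed VM's data.
    print(render_http_probe("tomcat-vm", "10.0.0.5", 8080, "/app"))
```

The point is only that the application hands over IP, port, and URL; how the
monitoring app turns that into its own configuration is up to the monitoring
app.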

So, if you want to have monitoring for your apps, then you will probably
need to modify the Nagios application in Murano to expose registration and
e-mail setup to end users. Then Nagios will send notifications to the user
rather than to Murano.

Another option is to add specific workflow actions in Murano or register
a Mistral workflow to react to Nagios monitoring events.

The last, but not least, option is for the application to have a set of
actions for some critical events. The application itself detects problems,
like an error in DB transactions, and sends a POST request to an action URL.
The action calls a workflow, which uses the monitoring application interface
to trigger an event in the monitoring system. The idea here is that the
application does not know beforehand which monitoring service is used, but it
requires that a monitoring service be available with a known interface
implemented as Murano methods. I am not sure if I am good at explaining
all this :-)
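
Maybe a small sketch helps. This is plain Python rather than Murano's own
class definitions, and every name in it (MonitoringService,
InMemoryMonitoring, handle_action, the payload keys) is made up for
illustration. It only shows the shape of the idea: the workflow behind the
action URL talks to an abstract interface, and any monitoring app that
implements that interface can be plugged in.

```python
from abc import ABC, abstractmethod


class MonitoringService(ABC):
    """Hypothetical interface that every monitoring app would implement."""

    @abstractmethod
    def register_http_probe(self, address, port, url_path):
        """Ask the monitoring system to poll an HTTP endpoint."""

    @abstractmethod
    def trigger_event(self, source, severity, message):
        """Raise an event reported by an application."""


class InMemoryMonitoring(MonitoringService):
    """Stand-in for Nagios, Zabbix, etc.; it only records the calls."""

    def __init__(self):
        self.probes = []
        self.events = []

    def register_http_probe(self, address, port, url_path):
        self.probes.append((address, port, url_path))

    def trigger_event(self, source, severity, message):
        self.events.append((source, severity, message))


def handle_action(monitor, payload):
    """Workflow behind the action URL: forward the app's report.

    `payload` stands for the body of the POST request the app sent.
    """
    monitor.trigger_event(
        source=payload["source"],
        severity=payload.get("severity", "critical"),
        message=payload["message"],
    )


if __name__ == "__main__":
    monitor = InMemoryMonitoring()
    handle_action(monitor, {"source": "java-app",
                            "message": "DB transaction failed"})
    print(monitor.events)
```

The application only needs to know the action URL and the payload format;
swapping the monitoring app means swapping the implementation behind the
interface, not changing the application.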

Plenty of options are available, but they all still require some amount of work.

Thanks
Gosha

On Tue, May 12, 2015 at 5:07 PM, Fox, Kevin M <Kevin.Fox at pnnl.gov> wrote:

>  Cool, but using nagios or the like to trigger app level actions is not
> what I'm primarily interested in. Mostly the reverse. It's for the app
> definition to provide the information necessary for a monitoring system to
> report to the user when something is very wrong and needs intervention. For
> example, the website is unresponsive because the backend database server in
> the demo goes offline, or the maximum number of servers has already been
> reached and the site responsiveness is bad due to excessive load. Murano
> doesn't do that currently, does it?
>
> Thanks,
> Kevin
>
>  ------------------------------
> *From:* Georgy Okrokvertskhov [gokrokvertskhov at mirantis.com]
> *Sent:* Tuesday, May 12, 2015 2:04 PM
> *To:* OpenStack Development Mailing List (not for usage questions)
>
> *Subject:* Re: [openstack-dev] [new][cloudpulse] Announcing a project to
> HealthCheck OpenStack deployments
>
>   Here is how we do VM level monitoring in the application catalog.
> There is a Nagios application which deploys a Nagios VM to the user
> tenant. This Nagios application exposes an abstracted monitoring app
> interface to add probes and checks. Another application, Ceilometer Alarm,
> also allows you to use the same monitoring interface to add checks for a VM.
>
>  Demo is here: https://www.youtube.com/watch?v=OvPpJd0EOFw
>
>  As usual Heat is used under the hood for infrastructure level
> management. You can add other monitoring apps like Zabbix (
> https://github.com/openstack/murano-apps/tree/master/ZabbixAgent/package,
> https://github.com/openstack/murano-apps/tree/master/ZabbixServer/package)
>
>  Thanks
> Gosha
>
> On Tue, May 12, 2015 at 1:31 PM, Fox, Kevin M <Kevin.Fox at pnnl.gov> wrote:
>
>> It totally depends on how much experience you think a tenant user has...
>>
>> If we're talking about devops, they tend to have the skills to stand up a
>> configuration management server, a monitoring server, and manage everything
>> via config management.
>>
>> If tenant users are research scientists, like some of ours, it's a fair
>> amount of work to manage nagios without config management, and config
>> management is way more effort than most researchers want to put into
>> learning. That's where an app catalog becomes important, and something like
>> monitoring as a service starts to become interesting....
>>
>> Thanks,
>> Kevin
>> ________________________________________
>> From: Jay Pipes [jaypipes at gmail.com]
>> Sent: Tuesday, May 12, 2015 12:50 PM
>>  To: openstack-dev at lists.openstack.org
>> Subject: Re: [openstack-dev] [new][cloudpulse] Announcing a project to
>> HealthCheck OpenStack deployments
>>
>> On 05/12/2015 02:16 PM, Fox, Kevin M wrote:
>> > Nagios/whatever As A Service would actually be very useful I think.
>>
>> I don't really understand why Nagios-as-a-Service would be useful to
>> operators. I mean, operators install their monitoring system of choice
>> via their configuration management tool of choice -- Ansible, SaltStack,
>> Puppet, Chef, etc.
>>
>> Frankly, so do tenants. Tenants install software on their images using
>> configuration management tools like those mentioned above... I don't see a
>> reason to have Nagios-as-a-Service for tenants either.
>>
>> > Setting up a monitoring server is a fair amount of work.
>>
>> Not really. It's typically a simple apt-get install nagios-nrpe-plugins
>> on client VMs along with an apt-get install nagios-server on one or more
>> monitoring system VMs. Again, have configuration management systems
>> inject whatever check scripts you want paired with the ones that already
>> come with nagios-nrpe-plugins package.
>>
>> > If "Cloud Apps" downloaded from an OpenStack Catalog had a Monitoring
>> > Heat resource built in, that would register the launched app with a
>> > multitenant aware Cloud Monitoring Service, the user would only have
>> > to launch an app, and then go into the Dashboard and associate some
>> > kind of alerting policy with the registered checks. Say, email this
>> > address when things break. That would be awesome. :)
>>
>> I guess I just don't see this being in the realm of OpenStack. Or at
>> least, not more than something like a Murano application manifest which
>> is almost what you are describing above.
>>
>> I don't see the need for this service, sorry. Not everything needs to be
>> re-invented as a RESTful Python service endpoint...
>>
>> Best,
>> -jay
>>
>> > Thanks,
>> > Kevin
>> > ________________________________________
>> > From: Jay Pipes [jaypipes at gmail.com]
>> > Sent: Tuesday, May 12, 2015 10:48 AM
>> > To: openstack-dev at lists.openstack.org
>> > Subject: Re: [openstack-dev] [new][cloudpulse] Announcing a project to
>> > HealthCheck OpenStack deployments
>> >
>> > For operators:
>> >
>> > * Nagios
>> > * Icinga
>> > * Zabbix
>> >
>> > installed on baremetal machines deployed with the OpenStack and
>> > other infrastructure services.
>> >
>> > For tenants:
>> >
>> > * Nagios
>> > * Icinga
>> > * Zabbix
>> >
>> > installed on their VMs.
>> >
>> > Why are we re-inventing excellent open-source implementations of
>> > monitoring systems that have been around for over a decade?
>> >
>> > Best, -jay
>> >
>> > p.s. Sorry for top-posting.
>> >
>> > On 05/12/2015 01:20 PM, Vinod Pandarinathan (vpandari) wrote:
>> >> Hello,
>> >>
>> >> I'm pleased to announce the development of a new project called
>> >> CloudPulse.  CloudPulse provides OpenStack health-checking services
>> >> to operators, tenants, and applications. This project will
>> >> begin as a StackForge project based upon an empty cookiecutter[1]
>> >> repo.  The repos to work in are:
>> >> Server: https://github.com/stackforge/cloudpulse
>> >> Client: https://github.com/stackforge/python-cloudpulseclient
>> >>
>> >> Please join us via IRC on #openstack-cloudpulse on freenode.
>> >>
>> >> I am holding a doodle poll to select times for our first meeting
>> >> the week after summit.  This doodle poll will close May 24th and
>> >> meeting times will be announced on the mailing list at that time.
>> >> At our first IRC meeting, we will draft additional core team
>> >> members, so if you're interested in joining a fresh new development
>> >> effort, please attend our first meeting. Please take a moment if
>> >> you're interested in CloudPulse to fill out the doodle poll here:
>> >>
>> >> https://doodle.com/kcpvzy8kfrxe6rvb
>> >>
>> >> The initial core team is composed of Ajay Kalambur, Behzad Dastur,
>> >> Ian Wells, Pradeep chandrasekhar, Steven Dake, and Vinod
>> >> Pandarinathan. I expect more members to join during our initial
>> >> meeting.
>> >>
>> >> A little bit about CloudPulse: Cloud operators need notification of
>> >> OpenStack failures before a customer reports the failure. Cloud
>> >> operators can then take timely corrective actions with minimal
>> >> disruption to applications.  Many cloud applications, including
>> >> those I am interested in (NFV) have very stringent service level
>> >> agreements.  Loss of service can trigger contractual costs
>> >> associated with the service.  Application high availability
>> >> requires an operational OpenStack Cloud, and the reality is that
>> >> occasionally OpenStack clouds fail in some mysterious ways. This
>> >> project intends to identify when those failures occur so corrective
>> >> actions may be taken by operators, tenants, and the applications
>> >> themselves.
>> >>
>> >> OpenStack is considered healthy when OpenStack API services
>> >> respond appropriately.  Further, OpenStack is healthy when network
>> >> traffic can be sent between the tenant networks and can reach the
>> >> Internet.  Finally, OpenStack is healthy when all infrastructure
>> >> cluster elements are in an operational state.
>> >>
>> >> For information about blueprints check out:
>> >> https://blueprints.launchpad.net/cloudpulse
>> >> https://blueprints.launchpad.net/python-cloudpulseclient
>> >>
>> >> For more details, check out our Wiki:
>> >> https://wiki.openstack.org/wiki/Cloudpulse
>> >>
>> >> Please join the CloudPulse team in designing and implementing a
>> >> world-class Carrier Grade system for checking the health of
>> >> OpenStack clouds.  We look forward to seeing you on IRC on
>> >> #openstack-cloudpulse.
>> >>
>> >> Regards,
>> >> Vinod Pandarinathan
>> >>
>> >> [1] https://github.com/openstack-dev/cookiecutter
>> >>
>> >> __________________________________________________________________________
>> >> OpenStack Development Mailing List (not for usage questions)
>> >> Unsubscribe:
>> >> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


-- 
Georgy Okrokvertskhov
Architect,
OpenStack Platform Products,
Mirantis
http://www.mirantis.com
Tel. +1 650 963 9828
Mob. +1 650 996 3284