[openstack-dev] [new][cloudpulse] Announcing a project to HealthCheck OpenStack deployments

David Kranz dkranz at redhat.com
Wed May 13 13:27:05 UTC 2015


On 05/13/2015 09:06 AM, Simon Pasquier wrote:
> Hello,
>
> Like many others who commented before, I don't quite understand how unique 
> the CloudPulse use cases are.
>
> For operators, I got the feeling that existing solutions fit well:
> - Traditional monitoring tools (Nagios, Zabbix, ...) are necessary 
> anyway for infrastructure monitoring (CPU, RAM, disks, operating 
> system, RabbitMQ, databases and more) and diagnostic purposes. Adding 
> OpenStack service checks is fairly easy if you already have the toolchain.
Is it really so easy? RabbitMQ has an "aliveness" test that is easy to 
hook into. I don't know exactly what it does, other than what the doc 
says, but I should not have to. If I want my standard monitoring system 
to call into a cloud and ask "is nova healthy?", "is glance healthy?", 
etc., are there such calls?
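
For reference, the RabbitMQ check mentioned above is exposed by the management plugin as GET /api/aliveness-test/<vhost>, which answers {"status":"ok"} when the broker can round-trip a test message. Here is a minimal sketch of how a monitoring script might consume it. The host name is made up, the helper names are mine, and the actual HTTP request (which needs management credentials) is left out.

```python
import json
from urllib.parse import quote


def aliveness_url(host, vhost="/", port=15672):
    """Build the management-API aliveness URL.

    The vhost is a path segment, so the default vhost "/" must be
    percent-encoded as %2F. Port 15672 is the management plugin default.
    """
    return "http://{}:{}/api/aliveness-test/{}".format(
        host, port, quote(vhost, safe="")
    )


def is_alive(body):
    """Return True only if the response body is JSON with status == "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError, TypeError):
        # Not JSON, not a JSON object, or no body at all.
        return False
```

The point is that the caller only needs a URL and a one-field answer, with no knowledge of what the broker does internally.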

There are various sets of calls associated with Nagios, Zabbix, etc., but 
those seem like "after-market" parts for a car. It seems to me the services 
themselves would know best how to check if they are healthy, 
particularly as that could change from version to version. Has there been 
discussion of adding a health-check (admin) API in each service? Lacking 
that, is there documentation from any OpenStack projects about "how to 
check the health of nova"? When I saw this thread start, that is what I 
thought it was going to be about.
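
To make the question concrete, here is a rough sketch of what the monitoring side could look like if each service did offer such a call. Everything about the service response is an assumption on my part: no OpenStack service is known to me to return this JSON shape, and the "ok"/"degraded"/"down" vocabulary is invented for illustration. Only the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are the standard Nagios plugin convention.

```python
import json

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = ("OK", "WARNING", "CRITICAL", "UNKNOWN")

# Hypothetical status vocabulary a service health-check API might return.
STATUS_TO_EXIT = {"ok": OK, "degraded": WARNING, "down": CRITICAL}


def nagios_result(service, http_status, body):
    """Map a (hypothetical) health-check response to (exit_code, message)."""
    if http_status != 200:
        return CRITICAL, "{} CRITICAL - HTTP {}".format(service, http_status)
    try:
        status = json.loads(body)["status"]
        code = STATUS_TO_EXIT.get(status, UNKNOWN)
    except (ValueError, KeyError, TypeError):
        # Body is not JSON, lacks a "status" field, or has an odd type.
        return UNKNOWN, "{} UNKNOWN - unparseable health response".format(service)
    return code, "{} {} - reported status {!r}".format(service, LABELS[code], status)
```

With something like this, the plugin stays the same from release to release, and only the service decides what "ok" means.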

  -David

> - OpenStack projects like Rally or Tempest can generate synthetic 
> loads and run end-to-end tests. Integrating them with a monitoring 
> system isn't terribly difficult either.
>
> As far as Monitoring-as-a-service is concerned, do you have plans to 
> integrate/leverage Ceilometer?
>
> BR,
> Simon
>
> On Tue, May 12, 2015 at 7:20 PM, Vinod Pandarinathan (vpandari) 
> <vpandari at cisco.com> wrote:
>
>     Hello,
>
>       I'm pleased to announce the development of a new project called
>     CloudPulse.  CloudPulse provides OpenStack
>     health-checking services to operators, tenants, and
>     applications. This project will begin as
>     a StackForge project based upon an empty cookiecutter[1] repo. 
>     The repos to work in are:
>     Server: https://github.com/stackforge/cloudpulse
>     Client: https://github.com/stackforge/python-cloudpulseclient
>
>     Please join us via IRC on #openstack-cloudpulse on freenode.
>
>     I am holding a doodle poll to select times for our first meeting
>     the week after summit.  This doodle poll will close May 24th and
>     meeting times will be announced on the mailing list at that time.
>     At our first IRC meeting,
>     we will draft additional core team members, so if you're interested
>     in joining a fresh new development effort, please attend our first
>     meeting.
>     Please take a moment if you're interested in CloudPulse to fill out
>     the doodle poll here:
>
>     https://doodle.com/kcpvzy8kfrxe6rvb
>
>     The initial core team is composed of
>     Ajay Kalambur, Behzad Dastur, Ian Wells, Pradeep Chandrasekhar,
>     Steven Dake and Vinod Pandarinathan.
>     I expect more members to join during our initial meeting.
>
>      A little bit about CloudPulse:
>      Cloud operators need notification of OpenStack failures before a
>     customer reports the failure. Cloud operators can then take timely
>     corrective actions with minimal disruption to applications. Many
>     cloud applications, including
>     those I am interested in (NFV), have very stringent service level
>     agreements.  Loss of service can trigger contractual
>     costs associated with the service.  Application high availability
>     requires an operational OpenStack Cloud, and the reality
>     is that occasionally OpenStack clouds fail in some mysterious
>     ways.  This project intends to identify when those failures
>     occur so corrective actions may be taken by operators, tenants,
>     and the applications themselves.
>
>     OpenStack is considered healthy when OpenStack API services
>     respond appropriately.  Further, OpenStack is
>     healthy when network traffic can be sent between the tenant
>     networks and can reach the Internet.  Finally, OpenStack
>     is healthy when all infrastructure cluster elements are in an
>     operational state.
>
>     For information about blueprints check out:
>     https://blueprints.launchpad.net/cloudpulse
>     https://blueprints.launchpad.net/python-cloudpulseclient
>
>     For more details, check out our Wiki:
>     https://wiki.openstack.org/wiki/Cloudpulse
>
>     Please join the CloudPulse team in designing and implementing a
>     world-class Carrier Grade system for checking
>     the health of OpenStack clouds.  We look forward to seeing you on
>     IRC on #openstack-cloudpulse.
>
>     Regards,
>     Vinod Pandarinathan
>     [1] https://github.com/openstack-dev/cookiecutter
>
>
>     __________________________________________________________________________
>     OpenStack Development Mailing List (not for usage questions)
>     Unsubscribe:
>     OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>     http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
