<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 05/13/2015 09:51 AM, Simon Pasquier
wrote:<br>
</div>
<blockquote
cite="mid:CAOq3GZU1cB+fAJMxKAZ86+p_69s2Px5YsdVKTTjS0gyav=y0Sg@mail.gmail.com"
type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Wed, May 13, 2015 at 3:27 PM,
David Kranz <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:dkranz@redhat.com" target="_blank">dkranz@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote">
<div><span class="">
<div>On 05/13/2015 09:06 AM, Simon Pasquier wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>Hello,<br>
<br>
Like many others who have commented before, I don't
quite understand how unique the
CloudPulse use cases are.<br>
<br>
For operators, my feeling is that
existing solutions fit well:<br>
- Traditional monitoring tools (Nagios,
Zabbix, ...) are necessary anyway for
infrastructure monitoring (CPU, RAM,
disks, operating system, RabbitMQ,
databases and more) and diagnostic
purposes. Adding OpenStack service checks
is fairly easy if you already have the
toolchain.<br>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</span> Is it really so easy? RabbitMQ has an
"aliveness" test that is easy to hook into. I don't know
exactly what it does, other than what the doc says, but
I should not have to. If I want my standard monitoring
system to call into a cloud and ask "is nova healthy?",
"is glance healthy?", etc., are there such calls? <br>
</div>
</blockquote>
<div><br>
</div>
<div>Regarding the RabbitMQ aliveness test, it has its own
limits (more on that later; I've got an "interesting"
RabbitMQ outage that I'm going to discuss in a new thread)
and it doesn't replicate exactly what the clients (e.g.
OpenStack services) are doing.<br>
</div>
</div>
</div>
</div>
</blockquote>
I'm sure it has limits, but my point was that the developers of
RabbitMQ understood that it would be difficult for users to know
exactly what should be poked at inside to check health, so they
provide a call to do it. <br>
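<br>
As a rough sketch of what hooking that call into a standard monitoring system might look like: the snippet below assumes the management plugin's /api/aliveness-test/ endpoint and a JSON body carrying a "status" field. The host name, port and helper names are made up for illustration, and the exit codes follow the usual Nagios plugin convention.<br>

```python
import json
from urllib.parse import quote

# Nagios-style exit codes: 0 = OK, 2 = CRITICAL, 3 = UNKNOWN.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def aliveness_url(base_url, vhost="/"):
    """Build the management-API URL for the aliveness test of one vhost."""
    # The vhost name is a single path segment, so "/" must be sent as %2F.
    return base_url.rstrip("/") + "/api/aliveness-test/" + quote(vhost, safe="")


def interpret(body):
    """Map the JSON body returned by the aliveness test to an exit code."""
    try:
        status = json.loads(body).get("status")
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object: we cannot tell.
        return UNKNOWN
    return OK if status == "ok" else CRITICAL


# Hypothetical broker address, for illustration only.
print(aliveness_url("http://mq.example.com:15672"))
```

The monitoring side only needs to fetch that URL with admin credentials and pass the body to interpret(); it never has to know what the broker checks internally.<br>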
<blockquote
cite="mid:CAOq3GZU1cB+fAJMxKAZ86+p_69s2Px5YsdVKTTjS0gyav=y0Sg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> <br>
</div>
<div>Regarding the service checks, plenty of scripts already
exist for Nagios, collectd and so on. Some
of them are listed in the wiki [1].<br>
</div>
</div>
</div>
</div>
</blockquote>
I understand, and that is what I meant by "after-market". If someone
puts a new feature in service X that requires some monitoring to
be healthy, then all those different scripts need to chase after it
to keep up to date. Poking at service internals to check the health
of a service is an abstraction violation. As someone on this thread
said, Tempest/Rally can be used to check a certain kind of health,
but that is akin to black-box testing, whereas health monitoring should
be more akin to white-box testing.<br>
<blockquote
cite="mid:CAOq3GZU1cB+fAJMxKAZ86+p_69s2Px5YsdVKTTjS0gyav=y0Sg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<blockquote class="gmail_quote">
<div> <br>
There are various sets of calls associated with Nagios,
Zabbix, etc., but those seem like "after-market" parts
for a car. It seems to me the services themselves would
know best how to check if they are healthy, particularly
as that could change from version to version. Has there been
discussion of adding a health-check (admin) API in each
service? Lacking that, is there documentation from any
OpenStack projects about "how to check the health of
nova"? When I saw this thread start, that is what I
thought it was going to be about.<span class=""><br>
</span></div>
</blockquote>
<div><br>
</div>
<div>Starting with Kilo, you can configure your OpenStack
API services with the healthcheck middleware [2]. This was
inspired by what Swift has been doing for some time now
[3]. IIUC, the default healthcheck is minimalist and doesn't
check that dependent services (like RabbitMQ or the database)
are healthy, but the framework is extensible and more
healthchecks can be added.<br>
</div>
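<div>To illustrate the general idea of an extensible healthcheck in front of an API service, here is a minimal plain-WSGI sketch. It is not the oslo.middleware implementation; the class name, the check signature and the placeholder database check are all assumptions made for the example.<br>
</div>

```python
class HealthcheckMiddleware(object):
    """Answer requests for the healthcheck path; pass everything else on."""

    def __init__(self, app, path="/healthcheck", checks=None):
        self.app = app
        self.path = path
        # Each check is a callable returning (healthy, reason).
        self.checks = list(checks or [])

    def __call__(self, environ, start_response):
        if environ.get("PATH_INFO") != self.path:
            return self.app(environ, start_response)
        failures = []
        for check in self.checks:
            healthy, reason = check()
            if not healthy:
                failures.append(reason)
        if failures:
            status, body = "503 Service Unavailable", "; ".join(failures)
        else:
            status, body = "200 OK", "OK"
        payload = body.encode("utf-8")
        start_response(status, [("Content-Type", "text/plain"),
                                ("Content-Length", str(len(payload)))])
        return [payload]


def database_check():
    # Hypothetical placeholder: a real check would ping the database.
    return False, "database unreachable"
```

<div>With no checks registered, the sketch only proves that the API process answers; passing checks=[database_check] is one way dependent services could be covered.<br>
</div>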
</div>
</div>
</div>
</blockquote>
I can see that, but the real value would be in abstracting the
details of what it means for a service to be healthy inside the
implementation and exporting an API. If that were present, the
question of whether calling it used middleware or not would be
secondary. I'm not sure what the value-add of middleware would be in
this case.<br>
<br>
-David<br>
<br>
<br>
<br>
<br>
<blockquote
cite="mid:CAOq3GZU1cB+fAJMxKAZ86+p_69s2Px5YsdVKTTjS0gyav=y0Sg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote">
<div><span class=""> <br>
-David</span>
<div>
<div class="h5"><br>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>BR,<br>
</div>
<div>Simon<br>
</div>
<div><br>
[1] <a moz-do-not-send="true"
href="https://wiki.openstack.org/wiki/Operations/Tools#Monitoring_and_Trending">https://wiki.openstack.org/wiki/Operations/Tools#Monitoring_and_Trending</a><br>
[2] <a moz-do-not-send="true"
href="http://docs.openstack.org/developer/oslo.middleware/api.html#oslo_middleware.Healthcheck">http://docs.openstack.org/developer/oslo.middleware/api.html#oslo_middleware.Healthcheck</a><br>
[3] <a moz-do-not-send="true"
href="http://docs.openstack.org/kilo/config-reference/content/object-storage-healthcheck.html">http://docs.openstack.org/kilo/config-reference/content/object-storage-healthcheck.html</a><br>
</div>
<blockquote class="gmail_quote">
<div>
<div>
<div class="h5"> <br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>- OpenStack projects like Rally or
Tempest can generate synthetic loads and
run end-to-end tests. Integrating them
with a monitoring system isn't terribly
difficult either.<br>
</div>
</div>
<br>
As far as Monitoring-as-a-service is
concerned, do you have plans to
integrate/leverage Ceilometer?<br>
<br>
</div>
BR,<br>
</div>
Simon</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, May 12, 2015 at
7:20 PM, Vinod Pandarinathan (vpandari) <span
dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:vpandari@cisco.com"
target="_blank">vpandari@cisco.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote">
<div>
<div>
<div> <span>Hello,</span></div>
<div> <br>
</div>
<div> I'm pleased to announce the
development of a new project called
CloudPulse. CloudPulse provides
OpenStack</div>
<div> <span>health-checking services to
operators, tenants, and
applications. This project will
begin as </span></div>
<div> <span>a StackForge project based
upon an empty cookiecutter[1] repo.
The repos to work in are:</span></div>
<div> <span>Server: </span><span><a
moz-do-not-send="true"
href="https://github.com/stackforge/cloudpulse"
target="_blank">https://github.com/stackforge/cloudpulse</a></span></div>
<div> <span>Client: </span><span><a
moz-do-not-send="true"
href="https://github.com/stackforge/python-cloudpulseclient"
target="_blank">https://github.com/stackforge/python-cloudpulseclient</a></span></div>
<div> <br>
</div>
<div> <span>Please join us via IRC on
#openstack-cloudpulse on freenode.</span></div>
<div> <br>
</div>
<div> <span>I am holding a doodle poll
to select times for our first
meeting the week after summit. This
doodle poll will close May 24th and
meeting times will be announced on
the mailing list at that time. At
our first IRC meeting, </span></div>
<div> <span>we will draft additional
core team members, so if you're
interested in joining a fresh new
development effort, please attend
our first meeting. </span></div>
<div> Please take a moment if you're
interested in CloudPulse to fill out
the doodle poll here: </div>
<div> <br>
</div>
<div> <span><a moz-do-not-send="true"
href="https://doodle.com/kcpvzy8kfrxe6rvb"
target="_blank">https://doodle.com/kcpvzy8kfrxe6rvb</a></span></div>
<div> <br>
</div>
<div> The initial core team is composed
of</div>
<div> <span>Ajay Kalambur, </span></div>
<div> <span>Behzad Dastur, </span><span>Ian
Wells, </span><span>Pradeep
chandrasekhar, </span><span>Steven
Dake</span><span> and</span><span>
Vinod Pandarinathan</span><span>.</span><span>
<br>
</span></div>
<div> <span>I expect more members to
join during our initial meeting.</span></div>
<div> <br>
</div>
<div> A little bit about CloudPulse:</div>
<div> <span> Cloud operators need
notification of OpenStack failures
before a customer reports the
failure. Cloud operators can then
take timely corrective actions with
minimal disruption to applications.
Many cloud applications, including </span></div>
<div> <span>those I am interested in
(NFV) have very stringent service
level agreements. Loss of service
can trigger contractual</span></div>
<div> <span>costs associated with the
service. Application high
availability requires an operational
OpenStack Cloud, and the reality</span></div>
<div> <span>is that occasionally
OpenStack clouds fail in some
mysterious ways. This project
intends to identify when those
failures </span></div>
<div> <span>occur so corrective actions
may be taken by operators, tenants,
and the applications themselves.</span></div>
<div> <span><br>
</span></div>
<div> <span></span>OpenStack is
considered healthy when OpenStack API
services respond appropriately.
Further, OpenStack is</div>
<div> <span>healthy when network
traffic can be sent between the
tenant networks and </span><span>can
access the Internet. Finally,
OpenStack</span></div>
<div> <span>is healthy when all
infrastructure cluster elements are
in an operational state.</span></div>
<div> <br>
</div>
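<div> Read as a rule, the three conditions above could be combined roughly as follows. This is only an illustrative sketch, not CloudPulse code; the function name and the result keys are assumptions.</div>

```python
def cloud_health(api_ok, network_ok, infra_ok):
    """Combine the three conditions from the announcement into one verdict."""
    conditions = (("api", api_ok),
                  ("network", network_ok),
                  ("infrastructure", infra_ok))
    # Collect the names of the conditions that did not hold.
    failed = [name for name, ok in conditions if not ok]
    return {"healthy": not failed, "failed": failed}


print(cloud_health(True, False, True))
```

<div> Returning the list of failed conditions, rather than a single boolean, lets operators, tenants, and applications each react to the part they care about.</div>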
<div> <span>For information about
blueprints check out:</span></div>
<div> <span> </span><span><a
moz-do-not-send="true"
href="https://blueprints.launchpad.net/cloudpulse"
target="_blank">https://blueprints.launchpad.net/cloudpulse</a></span></div>
<div> <span><a moz-do-not-send="true"
href="https://blueprints.launchpad.net/python-cloudpulseclient"
target="_blank">https://blueprints.launchpad.net/python-cloudpulseclient</a></span></div>
<div> <br>
</div>
<div> For more details, check out our
Wiki:</div>
<div> <span><a moz-do-not-send="true"
href="https://wiki.openstack.org/wiki/Cloudpulse"
target="_blank">https://wiki.openstack.org/wiki/Cloudpulse</a></span></div>
<div> <br>
</div>
<div> Please join the CloudPulse team in
designing and implementing a
world-class Carrier Grade system for
checking</div>
<div> <span>the health of OpenStack
clouds. We look forward to seeing
you on IRC on #openstack-cloudpulse.</span></div>
<div> <br>
</div>
<div> Regards,</div>
<div> <span>Vinod Pandarinathan</span></div>
<div> <span>[1] </span><span><a
moz-do-not-send="true"
href="https://github.com/openstack-dev/cookiecutter"
target="_blank">https://github.com/openstack-dev/cookiecutter</a></span></div>
</div>
<div><br>
</div>
</div>
<br>
__________________________________________________________________________<br>
OpenStack Development Mailing List (not for
usage questions)<br>
Unsubscribe: <a moz-do-not-send="true"
href="mailto:OpenStack-dev-request@lists.openstack.org?subject:unsubscribe"
target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>
<a moz-do-not-send="true"
href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev"
target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
<br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</div>
</div>
</div>
<br>
<br>
</blockquote>
</div>
<br>
</div>
</div>
<br>
</blockquote>
<br>
</body>
</html>