[Openstack-operators] Healthcheck URLs for services

Joshua Harlow harlowja at fastmail.com
Fri Apr 29 15:58:39 UTC 2016


Yup, I'm the one who made that healthcheck middleware more advanced.

If you need to do anything special with it, let me know and I can help 
make that possible (or at least point out what would need to be changed 
to do that).

Simon Pasquier wrote:
> Hi,
>
> On Thu, Apr 28, 2016 at 5:13 AM, Andy Botting <andy at andybotting.com> wrote:
>
>     We're running our services clustered behind an F5 loadbalancer in
>     production, and haproxy in our testing environment. This setup works
>     quite well for us, but I'm not that happy with testing the health of
>     our endpoints.
>
>     We're currently calling basic URLs like / or /v2, and some
>     services return a 200 while others return codes like 401. Our
>     healthcheck test simply checks for whatever HTTP code each
>     service normally returns. This works OK and does catch basic
>     service failure.
>
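For what it's worth, that kind of status-code check can be sketched in a few lines of stdlib Python. This is only an illustration: the `probe` name, the default timeout and the injectable `opener` argument are my own assumptions, not something a load balancer or any OpenStack service ships.

```python
import urllib.error
import urllib.request


def probe(url, expected_codes=(200,), timeout=2.0, opener=urllib.request.urlopen):
    """Return True if `url` answers with one of `expected_codes` within `timeout`.

    `opener` is injectable so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        # urllib raises on 4xx/5xx, but the code is still an answer: some
        # services legitimately return e.g. 401 on an unauthenticated "/".
        status = exc.code
    except OSError:
        # Connection refused, unreachable host, or timeout: count as down.
        return False
    return status in expected_codes
```

A service that answers 401 on / would be probed with `probe(url, expected_codes=(401,))`. The timeout only helps with the "port open but calls hang" case if the probed URL actually reaches the backend that hangs.
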
>     Our test environment is on flaky hardware and often fails in strange
>     ways. Sometimes the port is open and basic URLs work, but real API
>     calls fail and time out, so our checks fall down here.
>
>     In a previous role I had, the developers added a URL (e.g.
>     /healthcheck) to each web application which went through and tested
>     things like whether the db connection was OK, memcached was
>     accessible, etc., and returned a 200 if so. This worked out really
>     well for operations. I haven't seen anything like this for OpenStack.
>
>
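A rough sketch of what such a /healthcheck endpoint could look like as a plain WSGI app, assuming each dependency check is a zero-argument callable that raises on failure. The `make_healthcheck_app` name and the JSON body are hypothetical choices for illustration; this is not how any OpenStack service does it today.

```python
import json


def make_healthcheck_app(checks):
    """Build a WSGI app that runs every check and reports the outcome.

    `checks` maps a name to a zero-argument callable that raises on failure.
    """
    def app(environ, start_response):
        results = {}
        for name, check in checks.items():
            try:
                check()
                results[name] = "ok"
            except Exception as exc:
                # Record the failure instead of letting it abort the probe.
                results[name] = "failed: %s" % exc
        healthy = all(value == "ok" for value in results.values())
        status = "200 OK" if healthy else "503 Service Unavailable"
        body = json.dumps(results, sort_keys=True).encode("utf-8")
        start_response(status, [("Content-Type", "application/json"),
                                ("Content-Length", str(len(body)))])
        return [body]
    return app
```

Real checks would be things like a trivial query against the database or a memcached get, each with its own short timeout so the probe itself cannot hang.
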
> There's a healthcheck oslo.middleware plugin [1] available. You could
> configure the service pipeline to include it, although it won't
> exercise the db connection, RabbitMQ connection, and so on. But it would
> help if you want to kick a service instance out of the load-balancer
> without stopping the service completely [2].
>
> [1]
> http://docs.openstack.org/developer/oslo.middleware/healthcheck_plugins.html
> [2]
> http://docs.openstack.org/developer/oslo.middleware/healthcheck_plugins.html#disable-by-file
>
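To show the idea behind [2] (not the oslo.middleware code itself, just a stdlib stand-in), here is a small WSGI middleware that answers one path on its own and reports the instance as unavailable while a marker file exists. The class name, default path and file location are assumptions made for this example.

```python
import os


class DisableByFile:
    """WSGI middleware: answer `path` itself, pass everything else through."""

    def __init__(self, app, path="/healthcheck",
                 disable_file="/tmp/healthcheck_disable"):
        self.app = app
        self.path = path
        self.disable_file = disable_file

    def __call__(self, environ, start_response):
        if environ.get("PATH_INFO") != self.path:
            # Not a health probe: hand the request to the wrapped service.
            return self.app(environ, start_response)
        if os.path.exists(self.disable_file):
            status, body = "503 Service Unavailable", b"DISABLED BY FILE"
        else:
            status, body = "200 OK", b"OK"
        start_response(status, [("Content-Type", "text/plain"),
                                ("Content-Length", str(len(body)))])
        return [body]
```

Touching the file makes the load balancer drop the instance while in-flight requests keep being served; removing the file puts it back in rotation.
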
>     I'm wondering how everyone else does healthchecking of their
>     clustered services, and whether or not they think adding a dedicated
>     healthcheck URL would be beneficial?
>
>
> From what I can tell, people are doing the same thing as you do: check
> that a well-known location ('/', '/v2' or similar) returns the expected
> code and hope that it will work for real user requests too.
>
> Simon
>
>
>     We do use scripts similar to the ones in osops-tools-monitoring with
>     Nagios, which help with more complex testing, but I'm thinking of
>     something more lightweight specifically for setting up on loadbalancers.
>
>     cheers,
>     Andy
>
>     _______________________________________________
>     OpenStack-operators mailing list
>     OpenStack-operators at lists.openstack.org
>     <mailto:OpenStack-operators at lists.openstack.org>
>     http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
