Open Stack

Thu Apr 28 12:46:16 UTC 2016

Hi,

On Thu, Apr 28, 2016 at 5:13 AM, Andy Botting <andy at andybotting.com> wrote:

> We're running our services clustered behind an F5 loadbalancer in
> production, and haproxy in our testing environment. This setup works quite
> well for us, but I'm not that happy with testing the health of our
> endpoints.
>
> We're currently calling basic URLs like / or /v2 etc and some services
> return a 200, some return other codes like 401. Our healthcheck test simply
> checks against whatever the http code returns. This works OK and does catch
> basic service failure.
>
> Our test environment is on flaky hardware and often fails in strange ways.
> Sometimes the port is open and basic URLs work, but real API calls fail
> and time out, so our checks fall down here.
>
> In a previous role I had, the developers added a url (e.g. /healthcheck)
> to each web application which went through and tested things like the db
> connection was OK, memcached was accessible, etc and returned a 200. This
> worked out really great for operations. I haven't seen anything like this
> for OpenStack.
>
>
There's a healthcheck plugin available in oslo.middleware [1]. You could
configure the service pipeline to include it, but it won't exercise the db
connection, the RabbitMQ connection, and so on. It would still help if you
want to take a service instance out of the load balancer without stopping
the service completely [2].

[1]
http://docs.openstack.org/developer/oslo.middleware/healthcheck_plugins.html
[2]
http://docs.openstack.org/developer/oslo.middleware/healthcheck_plugins.html#disable-by-file
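
To illustrate the idea, here is a minimal sketch of a WSGI healthcheck
middleware in plain Python. It is not the oslo.middleware implementation:
the class name, the parameters (path, disable_file, checks) and the response
bodies are all assumptions made for this example. It answers on a dedicated
path, reports unhealthy when a "disable" file exists, and optionally runs
extra dependency checks (db, memcached, etc.) supplied as callables.

```python
import os


class HealthcheckMiddleware:
    """Hypothetical WSGI middleware; NOT the oslo.middleware implementation."""

    def __init__(self, app, path="/healthcheck", disable_file=None, checks=None):
        self.app = app                    # the wrapped WSGI application
        self.path = path                  # URL path the load balancer polls
        self.disable_file = disable_file  # if this file exists, report unhealthy
        self.checks = checks or {}        # name -> callable that raises on failure

    def __call__(self, environ, start_response):
        # Anything other than the healthcheck path goes to the real service.
        if environ.get("PATH_INFO") != self.path:
            return self.app(environ, start_response)

        failures = self._failures()
        if failures:
            status, body = "503 Service Unavailable", "; ".join(failures)
        else:
            status, body = "200 OK", "OK"

        payload = body.encode("utf-8")
        headers = [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(payload))),
        ]
        start_response(status, headers)
        return [payload]

    def _failures(self):
        failures = []
        # Lets an operator drain this instance without stopping the service.
        if self.disable_file and os.path.exists(self.disable_file):
            failures.append("DISABLED BY FILE")
        # Each check is expected to raise an exception when its backend is down.
        for name, check in self.checks.items():
            try:
                check()
            except Exception as exc:
                failures.append(f"{name}: {exc}")
        return failures
```

With something like this in the pipeline, the load balancer would poll
/healthcheck and treat any non-200 answer as "remove this backend". The
dependency checks should be cheap and have their own short timeouts,
otherwise the healthcheck itself can hang on the flaky backend it is
supposed to detect.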

> I'm wondering how everyone else does healthchecking of their clustered
> services, and whether or not they think adding a dedicated healthcheck URL
> would be beneficial?
>

From what I can tell, people are doing the same thing as you: checking that
a well-known location ('/', '/v2' or similar) returns the expected code and
hoping that it will work for real user requests too.
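
For reference, that kind of check boils down to something like the sketch
below. The function name and its parameters are made up for this example; it
uses only the Python standard library and compares the returned status code
against whatever each service is expected to answer (200 for some, 401 for
others).

```python
import urllib.error
import urllib.request


def endpoint_is_healthy(url, expected_status=200, timeout=5.0):
    """Return True if a GET on url answers with expected_status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the code may still be the expected one
        # (e.g. 401 from an endpoint that requires a token).
        status = err.code
    except OSError:
        # Connection refused, DNS failure, timeout, and similar errors.
        return False
    return status == expected_status
```

The limitation is the same as with the load balancer monitors: a matching
status code on '/' only shows that the API process is answering, not that
its database or message queue connections are working.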

Simon

>
> We do use scripts similar to ones in the osops-tools-monitoring in Nagios
> which help with more complex testing, but I'm thinking of something more
> lightweight specifically for setting up on loadbalancers.
>
> cheers,
> Andy

[Openstack-operators] Healthcheck URLs for services
