[tripleo] Expose healthchecks via systemd for podman
The healthchecks currently in use in tripleo depend on the docker healthcheck implementation, so podman/oci containers will not expose them. The /openstack/healthcheck script is still available in the images; we just need a way to run the checks and expose the result.
https://review.openstack.org/#/c/620372/ proposes having paunch write out a $service-healthcheck systemd unit and corresponding timer.
Interested in what folks think about the approach.
-Jill
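For readers who want a concrete picture of the proposal, here is a minimal sketch of what rendering a $service-healthcheck unit and its timer might look like. It is not the code from the review: the function name, the `/usr/bin/podman` path, the `podman exec` invocation, the 60s default interval, and the `nova_api` container name in the usage note are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical templates; the actual patch may lay the units out differently.
SERVICE_TEMPLATE = """\
[Unit]
Description={name} healthcheck

[Service]
Type=oneshot
ExecStart=/usr/bin/podman exec {name} /openstack/healthcheck
"""

TIMER_TEMPLATE = """\
[Unit]
Description={name} healthcheck timer

[Timer]
OnActiveSec={interval}
OnUnitActiveSec={interval}

[Install]
WantedBy=timers.target
"""


def render_healthcheck_units(name: str, interval: str = "60s") -> Dict[str, str]:
    """Return a mapping of unit file name to unit file content.

    Nothing is written to disk here; the caller decides where the
    files go (for example /etc/systemd/system).
    """
    base = f"{name}-healthcheck"
    return {
        f"{base}.service": SERVICE_TEMPLATE.format(name=name),
        # A timer activates the service of the same base name by default.
        f"{base}.timer": TIMER_TEMPLATE.format(name=name, interval=interval),
    }
```

Calling `render_healthcheck_units("nova_api")` would return the text of `nova_api-healthcheck.service` and `nova_api-healthcheck.timer`, which could then be written out and enabled with `systemctl enable --now nova_api-healthcheck.timer`.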
On Thu, 2018-11-29 at 17:21 -0700, Jill Rouleau wrote:
> The healthchecks currently in use in tripleo depend on the docker healthcheck implementation, so podman/oci containers will not expose them. The /openstack/healthcheck script is still available in the images, we just need a way to run the checks and expose the result.
> https://review.openstack.org/#/c/620372/ proposes having paunch write out a $service-healthcheck systemd unit and corresponding timer.
With Docker our health checks weren't used for anything other than information, if I recall. Is the idea here that you would use systemd to obtain the healthcheck status for any given service? And then monitoring tools could call that periodically as a crude means of obtaining the status of each service?
Or is the intent here to go a step further and have systemd take some kind of action if things are unhealthy, like restarting the service?
Dan
> Interested in what folks think about the approach.
> -Jill
On Fri, 2018-11-30 at 08:13 -0500, Dan Prince wrote:
> On Thu, 2018-11-29 at 17:21 -0700, Jill Rouleau wrote:
> > The healthchecks currently in use in tripleo depend on the docker healthcheck implementation, so podman/oci containers will not expose them. The /openstack/healthcheck script is still available in the images, we just need a way to run the checks and expose the result.
> > https://review.openstack.org/#/c/620372/ proposes having paunch write out a $service-healthcheck systemd unit and corresponding timer.
> With Docker our health checks weren't used for anything other than information, if I recall. Is the idea here that you would use systemd to obtain the healthcheck status for any given service? And then monitoring tools could call that periodically as a crude means of obtaining the status of each service?
Right, for now it's just to maintain the functionality of providing a means to run the check on a schedule and see the check output without having to rewrite the healthchecks, extend podman/varlink, or build significant new monitoring interfaces.
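As an illustration of how a monitoring tool could read the result of a scheduled check, here is a rough sketch built on `systemctl show`. The helper names are hypothetical, and treating `Result=success` as "healthy" is an assumption about how a consumer might interpret the unit state, not something the patch defines.

```python
import subprocess
from typing import Dict


def parse_show_output(text: str) -> Dict[str, str]:
    """Parse the KEY=VALUE lines printed by `systemctl show`."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank or malformed lines
            props[key] = value
    return props


def is_healthy(props: Dict[str, str]) -> bool:
    """Assumed rule: the last run succeeded and the unit is not failed."""
    return props.get("Result") == "success" and props.get("ActiveState") != "failed"


def check_unit(unit: str) -> bool:
    """Query systemd for one healthcheck unit and apply the rule above."""
    out = subprocess.run(
        ["systemctl", "show", unit, "--property=Result,ActiveState"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return is_healthy(parse_show_output(out))
```

A poller could call `check_unit("nova_api-healthcheck.service")` (container name hypothetical) on whatever schedule suits it. This rule may not distinguish a check that has never run from one that passed, so a real tool would probably look at more than these two properties.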
> Or is the intent here to go a step further and have systemd take some kind of action if things are unhealthy, like restarting the service?
That's something we could do in the future using systemd's OnFailure=$do_something, but it's beyond the scope of this initial patch.
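To make the OnFailure= idea concrete, here is a small sketch that adds an `OnFailure=` line to the `[Unit]` section of an existing unit file's text. The function name and the handler unit used in the usage note are hypothetical; what the handler would actually do (restart the container, raise an alert) is left open, as in the thread.

```python
def add_on_failure(unit_text: str, handler_unit: str) -> str:
    """Return unit_text with OnFailure=handler_unit in its [Unit] section.

    If the text has no [Unit] section, one is prepended.
    """
    out = []
    added = False
    for line in unit_text.splitlines():
        out.append(line)
        if not added and line.strip() == "[Unit]":
            # systemd activates the listed unit(s) when this unit
            # enters the "failed" state.
            out.append(f"OnFailure={handler_unit}")
            added = True
    if not added:
        out = ["[Unit]", f"OnFailure={handler_unit}", ""] + out
    return "\n".join(out) + "\n"
```

For example, `add_on_failure(service_text, "nova_api-restart.service")` would make systemd start a (hypothetical) `nova_api-restart.service` whenever the healthcheck unit fails.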
> Dan
> > Interested in what folks think about the approach.
> > -Jill