Hi,

I made a little progress on the flipping health status. When a cluster becomes unhealthy, the curl command returns:

-----
$ curl --insecure https://157.136.248.202:6443/healthz
[+]ping ok
[+]log ok
[-]etcd failed: reason withheld
... (all others ok)
-----

This lasts a few minutes, then the affected clusters become healthy again. It seems to happen to several clusters at the same time. Sometimes a cluster stays healthy for only a few seconds or minutes before becoming unhealthy again, and so on. The cluster (kubectl) tends to be unresponsive when transitioning from one state to the other, whereas curl always responds. It looks like something becomes unresponsive for some time...

Any suggestions welcome!

Michel

On 31/05/2024 at 18:43, Michel Jouvin wrote:
Hi Jake,
Thanks! Yes, I am using the k8s_fedora_coreos_v1 driver. I'll check with curl as you suggested. Does it make sense that a check (or check status) could be updated when another cluster is created or deleted? I could hardly believe it, but there seems to be some "correlation"...
Best regards,
Michel
On 31 May 2024 at 18:29:59, Jake Yip <jake.yip@ardc.edu.au> wrote:
On 1/6/2024 2:08 am, Michel Jouvin wrote:
Contrary to what I said initially, although creating or deleting a cluster seems to trigger some update in the health state of other clusters, it does not seem to be the cause. I have seen the state change quite regularly on a test cloud with no activity, and I'm really wondering what could be causing this. I don't see anything in the OpenStack config or logs to explain it. A network issue?
Michel
On 31/05/2024 at 16:32, Michel Jouvin wrote:
A side question: what runs the health status check, and is there a way to force it to run again?
Michel
Hi Jouvin,
You have not indicated which driver you are using; is this the k8s_fedora_coreos_v1 driver?
If so, health checks are done in a periodic loop by the conductors. They need to be able to poll the /healthz endpoint of your Kubernetes API server. You can check with curl -k https://<API_IP>:6443/healthz, where https://<API_IP>:6443 is the server in your kubeconfig.
- Jake
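
To narrow down when the state flips, the manual curl check discussed in this thread could be run in a loop that logs only state changes. The following is a minimal sketch, not what the Magnum conductor actually does: the function names (failed_checks, fetch_healthz, watch) are hypothetical, the 10-second interval is an arbitrary choice, and certificate verification is disabled in the same way as curl -k. It assumes the endpoint returns lines of the form "[-]etcd failed: reason withheld" as in the output quoted above.

```python
import ssl
import time
import urllib.error
import urllib.request


def failed_checks(body: str) -> list[str]:
    """Return the names of checks marked [-] in /healthz output."""
    failed = []
    for line in body.splitlines():
        line = line.strip()
        if line.startswith("[-]"):
            # "[-]etcd failed: reason withheld" -> "etcd"
            failed.append(line[3:].split(" ", 1)[0])
    return failed


def fetch_healthz(url: str, timeout: float = 5.0) -> str:
    """Fetch the endpoint without verifying the certificate (like curl -k)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        # An unhealthy API server answers with an error status;
        # the per-check details are still in the response body.
        return err.read().decode()


def watch(url: str, interval: float = 10.0) -> None:
    """Poll forever and print a timestamped line whenever the state changes."""
    previous = None
    while True:
        try:
            current = failed_checks(fetch_healthz(url))
        except OSError as exc:  # connection refused, timeout, DNS failure...
            current = [f"unreachable: {exc}"]
        if current != previous:
            print(time.strftime("%Y-%m-%d %H:%M:%S"), current or "healthy")
            previous = current
        time.sleep(interval)


# Example (replace <API_IP> with the server from your kubeconfig):
# watch("https://<API_IP>:6443/healthz?verbose")
```

Running one watcher per cluster and comparing the timestamps should show whether several clusters really flip at the same moment, and whether it is always the etcd check that fails.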