[Victoria][magnum][octavia] ingress-controller health degraded
Luke Camilleri
luke.camilleri at zylacomputing.com
Fri Jun 4 15:16:18 UTC 2021
Hi Everyone, we have the following problem and are trying to identify the
main cause.

We deployed an ingress and an ingress-controller using the following
deployment file:
https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.44.0/deploy/static/provider/cloud/deploy.yaml

The ingress-controller deployment succeeds with 1 replica of the
ingress-controller pod, and the Octavia load balancer is created
successfully, pointing to the NodePorts published on each node. At this
stage only 1 member was shown as healthy/online in the Load Balancers
screen.
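For completeness, the controller was rolled out with a plain apply of that
manifest, nothing customised; roughly:

    kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.44.0/deploy/static/provider/cloud/deploy.yaml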
I then increased the replicas to 3. In the Load Balancers screen in
Horizon the service is now reported as degraded, and only the Kubernetes
worker nodes that have one or more ingress-controller pods running on them
are reported as online. This is not the behaviour of a standard deployment,
where the NodePort forwards to the ClusterIP:Port of the internal service,
so as soon as a single pod is up the NodePorts on all nodes show as up when
queried:
ingress-nginx-controller-74fd5565fb-d86h9   1/1   Running   0   14h   10.100.3.13   k8s-c1-prod-2-klctfd24lze6-node-1   <none>   <none>
ingress-nginx-controller-74fd5565fb-h9985   1/1   Running   0   15h   10.100.1.8    k8s-c1-prod-2-klctfd24lze6-node-0   <none>   <none>
ingress-nginx-controller-74fd5565fb-qkddq   1/1   Running   0   15h   10.100.1.7    k8s-c1-prod-2-klctfd24lze6-node-0   <none>   <none>
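The scale-up itself was just the usual replica change on the controller
deployment; assuming the default names from that manifest, it amounts to
something like:

    kubectl -n ingress-nginx scale deployment ingress-nginx-controller --replicas=3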
The below shows the status of the members in the pool with 3 replicas:

+--------------------------------------+-----------------+------------+---------------------+---------------+---------------+------------------+--------+
| id                                   | name            | project_id | provisioning_status | address       | protocol_port | operating_status | weight |
+--------------------------------------+-----------------+------------+---------------------+---------------+---------------+------------------+--------+
| 834750fe-e43e-408d-abc3-aad3dcde0fdb | member_0_node-0 | id         | ACTIVE              | 192.168.1.75  | 32054         | ONLINE           | 1      |
| 1ddffd80-acae-40b3-a2de-19be0a69a039 | member_0_node-2 | id         | ACTIVE              | 192.168.1.90  | 32054         | ERROR            | 1      |
| d4e4baa4-0a69-4775-8ea0-165a207f11ae | member_0_node-1 | id         | ACTIVE              | 192.168.1.148 | 32054         | ONLINE           | 1      |
+--------------------------------------+-----------------+------------+---------------------+---------------+---------------+------------------+--------+
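For reference, the same member status can be pulled with the Octavia CLI
(the pool ID below is a placeholder for this listener's pool):

    openstack loadbalancer member list <pool-id>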
In fact, to have the deployment spread across all 3 nodes, I had to keep
increasing the replica count until every node was running at least one
instance of the ingress controller (in this case 5 replicas).
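As a side note, instead of over-provisioning replicas to force the spread,
a preferred pod anti-affinity rule on the controller deployment should give
the same result; a minimal sketch, assuming the default labels from the
v0.44.0 manifest:

    spec:
      template:
        spec:
          affinity:
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 100
                podAffinityTerm:
                  topologyKey: kubernetes.io/hostname
                  labelSelector:
                    matchLabels:
                      app.kubernetes.io/name: ingress-nginx
                      app.kubernetes.io/component: controller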
I do not believe this is an Octavia issue, since the health check is a TCP
check against the NodePort exposed by Kubernetes, and if no
ingress-controller pod is running on a node the port check against that
node fails. I added the octavia tag mainly to get some input that may
confirm whether this is the correct behaviour on Octavia's side.

I was expecting all members of the pool to report healthy, since I can
query the ClusterIP from any worker node on ports 80 and 443 and the
outcome is always successful, but the same check fails when using the
NodePort.
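To illustrate the kind of check I mean (the ClusterIP and node IP are
placeholders, and 32054 is the NodePort from the pool above):

    # succeeds from any worker node:
    curl -I http://<cluster-ip>:80/
    # fails on any node that is not running an ingress-controller pod:
    curl -I http://<node-ip>:32054/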
Thanks in advance