[Victoria][magnum][octavia] ingress-controller health degraded
Hi everyone,

We have the following problem and are trying to identify its root cause.

We deployed an ingress and an ingress-controller (using the following deployment file: https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.44....). The ingress-controller deployment succeeds with 1 replica of the ingress-controller pod, and the Octavia load balancer is created successfully, pointing at the NodePorts published on each node. At that stage the LoadBalancers screen showed only 1 member as healthy/ONLINE.

I then increased the replicas to 3. The LoadBalancers screen in Horizon now reports the service as DEGRADED, and only the Kubernetes worker nodes that actually have an ingress-controller pod scheduled on them are reported as ONLINE. This is not the behaviour of a standard deployment, where the NodePort forwards to the ClusterIP:port of the internal service, so once a single pod is up the NodePorts on all nodes respond when queried. The controller pods are currently spread as follows (kubectl get pods -o wide):

    NAME                                        READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
    ingress-nginx-controller-74fd5565fb-d86h9   1/1     Running   0          14h   10.100.3.13   k8s-c1-prod-2-klctfd24lze6-node-1   <none>           <none>
    ingress-nginx-controller-74fd5565fb-h9985   1/1     Running   0          15h   10.100.1.8    k8s-c1-prod-2-klctfd24lze6-node-0   <none>           <none>
    ingress-nginx-controller-74fd5565fb-qkddq   1/1     Running   0          15h   10.100.1.7    k8s-c1-prod-2-klctfd24lze6-node-0   <none>           <none>

The following shows the status of the members in the pool with replicas=3:

    | id                                   | name            | project_id | provisioning_status | address       | protocol_port | operating_status | weight |
    | 834750fe-e43e-408d-abc3-aad3dcde0fdb | member_0_node-0 | id         | ACTIVE              | 192.168.1.75  | 32054         | ONLINE           | 1      |
    | 1ddffd80-acae-40b3-a2de-19be0a69a039 | member_0_node-2 | id         | ACTIVE              | 192.168.1.90  | 32054         | ERROR            | 1      |
    | d4e4baa4-0a69-4775-8ea0-165a207f11ae | member_0_node-1 | id         | ACTIVE              | 192.168.1.148 | 32054         | ONLINE           | 1      |

In fact, to get the deployment spread across all 3 nodes I had to keep increasing the replica count until every node had at least one instance of the ingress controller running on it (in this case that meant 5 replicas).

I do not believe this is an Octavia issue: the health check is a TCP check against the NodePort exposed by Kubernetes, and if no ingress-controller pod is running on a node, the port check on that node fails. I added the [octavia] tag to the subject mainly to get input confirming whether this is the expected behaviour on the Octavia side.

I would expect all members of the pool to report healthy: querying the ClusterIP from any worker node on ports 80 and 443 always succeeds, but the same query against the NodePort does not (the exact checks are reproduced below for reference).

Thanks in advance
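For reference, the connectivity checks described above look roughly like this (a sketch: <cluster-ip> is a placeholder for the controller Service's ClusterIP, which I have not pasted here; 192.168.1.90 and 32054 are node-2's address and the NodePort taken from the member listing):

    # From any worker node, hitting the Service ClusterIP always succeeds
    curl -s -o /dev/null -w '%{http_code}\n' http://<cluster-ip>:80/
    curl -sk -o /dev/null -w '%{http_code}\n' https://<cluster-ip>:443/

    # A plain TCP check against the NodePort, mirroring what the Octavia
    # health monitor does; on node-2 (no controller pod) this one fails
    nc -zv 192.168.1.90 32054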
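The member listing above comes from the Octavia CLI (the pool and health-monitor IDs below are placeholders). On the Kubernetes side, one thing I still want to verify is the Service's externalTrafficPolicy, since a value of Local makes kube-proxy answer the NodePort only on nodes that actually run a controller pod, which would match what I am seeing (namespace and Service name are assumed from the v0.44 manifest):

    # Octavia view: member status and the health monitor doing the TCP check
    openstack loadbalancer member list <pool-id>
    openstack loadbalancer healthmonitor show <healthmonitor-id>

    # Kubernetes view: if this prints "Local", NodePort traffic is only
    # served by nodes hosting an ingress-nginx-controller pod
    kubectl -n ingress-nginx get svc ingress-nginx-controller \
        -o jsonpath='{.spec.externalTrafficPolicy}'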
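Also, rather than raising the replica count to 5 just to cover every node, I will probably try pod anti-affinity to force at most one controller per node, along these lines (a sketch; the label is assumed from the v0.44 manifest):

    # Snippet for the Deployment's spec.template.spec; assumes the
    # app.kubernetes.io/name=ingress-nginx label used by the v0.44 manifest
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: ingress-nginx
          topologyKey: kubernetes.io/hostname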
Luke Camilleri