Hi Luke,
From the snippet you provided, it looks like the load balancer is healthy and working as expected, but the health check of the member endpoint is failing.
I also see that the health monitor is configured with a type of TCP and the members are configured for port 31765. This assumes that the members do not have "monitor_address" or "monitor_port" configured to override the health check endpoint.
One thing to check is the subnet_id that was configured on the members in Octavia. The members (192.168.1.99:31765, etc.) must be reachable from that subnet_id. If no subnet_id was specified when creating the members, Octavia will use the VIP subnet of the load balancer.
Sometimes users forget to specify the subnet_id from which a member should be reachable when creating the member on the load balancer.
Michael
On Mon, May 24, 2021 at 1:48 PM Luke Camilleri <luke.camilleri@zylacomputing.com> wrote:
I have configured magnum with a service type of LoadBalancer and it is successfully deploying an external LoadBalancer via Octavia. The problem is that I cannot reach the public IP address but can see the entries in the haproxy.log on the amphora instance and the log shows <NOSRV> 0 SC-- at the end of each entry when it is being accessed (via a browser for example).
So the Octavia part seems to be fine: the config shows the correct LB --> listener --> pool --> member chain and the nodeports that the service should be listening on (I am assuming that the same nodeports are also used for the health checks).
The haproxy.cfg in the amphora instance shows the below in the pool members section:
server 43834d2f-4e22-4065-b448-ddf0713f2ced 192.168.1.191:31765 weight 1 check inter 60s fall 3 rise 3
server 3a733c48-24dd-426e-8394-699a908121ee 192.168.1.36:31765 weight 1 check inter 60s fall 3 rise 3
server 8d093783-79c9-4094-b3a2-8d31b1c4567f 192.168.1.99:31765 weight 1 check inter 60s fall 3 rise 3
and the status of the pool's members (# openstack loadbalancer member list) is as follows:
+--------------------------------------+---------------------+---------------+---------------+------------------+
| id                                   | provisioning_status | address       | protocol_port | operating_status |
+--------------------------------------+---------------------+---------------+---------------+------------------+
| 8d093783-79c9-4094-b3a2-8d31b1c4567f | ACTIVE              | 192.168.1.99  | 31765         | ERROR            |
| 43834d2f-4e22-4065-b448-ddf0713f2ced | ACTIVE              | 192.168.1.191 | 31765         | ERROR            |
| 3a733c48-24dd-426e-8394-699a908121ee | ACTIVE              | 192.168.1.36  | 31765         | ERROR            |
+--------------------------------------+---------------------+---------------+---------------+------------------+
From the below loadbalancer healthmonitor show command I can see that the health checks are being done via TCP on the same port and can confirm that the security group allows the nodeports range (ALLOW IPv4 30000-32767/tcp from 0.0.0.0/0)
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| provisioning_status | ACTIVE                               |
| type                | TCP                                  |
| id                  | 86604638-27db-47d2-ad9c-0594564a44be |
| operating_status    | ONLINE                               |
+---------------------+--------------------------------------+
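For reference, a TCP health check of this kind boils down to whether a TCP connection to each member's address and port can be opened. The sketch below is only an illustration of that idea, not Octavia's or haproxy's actual code: the function names tcp_check and probe_members are hypothetical, and it assumes it is run from a host or network namespace attached to the subnet the members are expected to be reachable from.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper: a plain connect-and-close probe, similar in spirit
    to a TCP-type health check.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable networks.
        return False


def probe_members(addresses, port: int, timeout: float = 5.0) -> dict:
    """Probe every member address on the given port; map address -> result."""
    return {addr: tcp_check(addr, port, timeout) for addr in addresses}
```

Calling probe_members(["192.168.1.191", "192.168.1.36", "192.168.1.99"], 31765) from a machine on the member subnet would show whether the nodeport accepts connections at all. If it does there but the members still report ERROR, the path from the amphora to that subnet is the more likely suspect.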
$ kubectl describe service nginx-service
Name:                     nginx-service
Namespace:                default
Labels:                   <none>
Annotations:              <none>
Selector:                 app=nginx
Type:                     LoadBalancer
IP Families:              <none>
IP:                       10.254.255.232
IPs:                      <none>
LoadBalancer Ingress:     185.89.239.217
Port:                     <unset>  80/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  31765/TCP
Endpoints:                10.100.4.11:8080,10.100.4.12:8080,10.100.5.10:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type    Reason                Age   From                Message
  ----    ------                ----  ----                -------
  Normal  EnsuringLoadBalancer  29m   service-controller  Ensuring load balancer
  Normal  EnsuredLoadBalancer   28m   service-controller  Ensured load balancer
The Kubernetes deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  type: LoadBalancer
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
Does anyone have any pointers as to why the amphora is not able to reach the nodeports of the Kubernetes workers?
Thanks in advance