[victoria][magnum][octavia] <NOSRV> in amphora
I have configured Magnum with a service type of LoadBalancer and it successfully deploys an external load balancer via Octavia. The problem is that I cannot reach the public IP address. I can see the entries in haproxy.log on the amphora instance, and each entry ends with <NOSRV> 0 SC-- when the address is accessed (via a browser, for example).

So the Octavia part seems to be fine: the config shows the correct LB --> listener --> pool --> member chain and the NodePorts that the service should be listening on (I am assuming the same NodePorts are also used for the health checks).

The haproxy.cfg on the amphora instance shows the below in the pool members section:

server 43834d2f-4e22-4065-b448-ddf0713f2ced 192.168.1.191:31765 weight 1 check inter 60s fall 3 rise 3
server 3a733c48-24dd-426e-8394-699a908121ee 192.168.1.36:31765 weight 1 check inter 60s fall 3 rise 3
server 8d093783-79c9-4094-b3a2-8d31b1c4567f 192.168.1.99:31765 weight 1 check inter 60s fall 3 rise 3

The status of the pool's members (# openstack loadbalancer member list) is as follows:

+--------------------------------------+---------------------+---------------+---------------+------------------+
| id                                   | provisioning_status | address       | protocol_port | operating_status |
+--------------------------------------+---------------------+---------------+---------------+------------------+
| 8d093783-79c9-4094-b3a2-8d31b1c4567f | ACTIVE              | 192.168.1.99  | 31765         | ERROR            |
| 43834d2f-4e22-4065-b448-ddf0713f2ced | ACTIVE              | 192.168.1.191 | 31765         | ERROR            |
| 3a733c48-24dd-426e-8394-699a908121ee | ACTIVE              | 192.168.1.36  | 31765         | ERROR            |
+--------------------------------------+---------------------+---------------+---------------+------------------+

From the loadbalancer healthmonitor show output below I can see that the health checks are done via TCP on the same port, and I can confirm that the security group allows the NodePort range (ALLOW IPv4 30000-32767/tcp from 0.0.0.0/0):

+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| provisioning_status | ACTIVE                               |
| type                | TCP                                  |
| id                  | 86604638-27db-47d2-ad9c-0594564a44be |
| operating_status    | ONLINE                               |
+---------------------+--------------------------------------+

$ kubectl describe service nginx-service
Name:                     nginx-service
Namespace:                default
Labels:                   <none>
Annotations:              <none>
Selector:                 app=nginx
Type:                     LoadBalancer
IP Families:              <none>
IP:                       10.254.255.232
IPs:                      <none>
LoadBalancer Ingress:     185.89.239.217
Port:                     <unset>  80/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  31765/TCP
Endpoints:                10.100.4.11:8080,10.100.4.12:8080,10.100.5.10:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type    Reason                Age   From                Message
  ----    ------                ----  ----                -------
  Normal  EnsuringLoadBalancer  29m   service-controller  Ensuring load balancer
  Normal  EnsuredLoadBalancer   28m   service-controller  Ensured load balancer

The Kubernetes deployment file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  type: LoadBalancer
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Does anyone have any pointers as to why the amphora is not able to reach the NodePorts of the Kubernetes workers?

Thanks in advance
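A minimal way to reproduce what the amphora's health monitor sees, assuming a test instance attached to the same 192.168.1.0/24 subnet as the members (an illustrative sketch, not output from the environment above; the IPs and port are taken from the listings in this message):

    # TCP/HTTP request straight at a worker's NodePort; a connection
    # timeout or refusal here matches the ERROR operating_status that
    # Octavia reports for the members.
    curl -v --connect-timeout 5 http://192.168.1.99:31765/

    # For comparison, the same request from a worker node against the
    # Service's ClusterIP, to separate a NodePort problem from a
    # problem with the backend pods themselves.
    curl -v --connect-timeout 5 http://10.254.255.232/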
Hi Luke,
From the snippet you provided, it looks like the load balancer is healthy and working as expected, but the health check of the member endpoint is failing.
I also see that the health monitor is configured for a type of TCP and the members are configured for port 31765. This assumes that the members do not have "monitor_address" or "monitor_port" configured to override the health check endpoint.

One thing to check is the subnet_id that was configured on the members in Octavia. The members (192.168.1.99:31765, etc.) must be reachable from that subnet_id. If no subnet_id was given when the members were created, the VIP subnet of the load balancer is used.

Users sometimes forget to specify the subnet_id that a member should be reached over when adding the member to the load balancer.

Michael
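One way to run the checks Michael suggests from the OpenStack CLI (a sketch; the pool, load balancer, and subnet identifiers are placeholders, as they are not shown at this point in the thread):

    # Which subnet and monitor address/port (if any) a member uses
    openstack loadbalancer member show <pool-name-or-id> <member-id>

    # The VIP subnet that members fall back to when no subnet_id was given
    openstack loadbalancer show <loadbalancer-name-or-id> -c vip_subnet_id

    # Confirm that subnet really is the one the worker nodes live on
    openstack subnet show <subnet-id-from-member> -c cidr -c network_id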
Hi Michael, and thanks for your reply. Below please find my answers:

Yes, the load balancer is healthy and the health check of the member endpoint is failing.

Yes, none of the members have monitor_address or monitor_port configured to override the health check endpoint (details of one of the members below):

+---------------------+-------------------------------------------------------------------------------------------------------------------+
| Field               | Value                                                                                                             |
+---------------------+-------------------------------------------------------------------------------------------------------------------+
| address             | 192.168.1.99                                                                                                      |
| admin_state_up      | True                                                                                                              |
| created_at          | 2021-05-24T20:07:56                                                                                               |
| id                  | 8d093783-79c9-4094-b3a2-8d31b1c4567f                                                                              |
| name                | member_0_k8s-c1-final-yrlns2q7qo2w-node-1_kube_service_f48f5572-0011-4b66-b9ab-63190d0c00b3_default_nginx-service |
| operating_status    | ERROR                                                                                                             |
| project_id          | 35a0fa65de1741619709485c5f6d989b                                                                                  |
| protocol_port       | 31765                                                                                                             |
| provisioning_status | ACTIVE                                                                                                            |
| subnet_id           | 19d73049-4bf7-455c-b278-3cba451102fb                                                                              |
| updated_at          | 2021-05-24T20:08:51                                                                                               |
| weight              | 1                                                                                                                 |
| monitor_port        | None                                                                                                              |
| monitor_address     | None                                                                                                              |
| backup              | False                                                                                                             |
+---------------------+-------------------------------------------------------------------------------------------------------------------+

The subnet ID in the member details above is the same as the subnet ID assigned to this project, shown below:

+----------------------+--------------------------------------+
| Field                | Value                                |
+----------------------+--------------------------------------+
| allocation_pools     | 192.168.1.10-192.168.1.254           |
| cidr                 | 192.168.1.0/24                       |
| created_at           | 2021-03-24T20:14:03Z                 |
| description          |                                      |
| dns_nameservers      | 8.8.4.4, 8.8.8.8                     |
| dns_publish_fixed_ip | None                                 |
| enable_dhcp          | True                                 |
| gateway_ip           | 192.168.1.1                          |
| host_routes          |                                      |
| id                   | 19d73049-4bf7-455c-b278-3cba451102fb |
| ip_version           | 4                                    |
| ipv6_address_mode    | None                                 |
| ipv6_ra_mode         | None                                 |
| name                 | red-subnet-2                         |
| network_id           | 9d2e17df-a93f-4709-941d-a6e8f98f5556 |
| prefix_length        | None                                 |
| project_id           | 35a0fa65de1741619709485c5f6d989b     |
| revision_number      | 1                                    |
| segment_id           | None                                 |
| service_types        |                                      |
| subnetpool_id        | None                                 |
| tags                 |                                      |
| updated_at           | 2021-03-26T13:59:07Z                 |
+----------------------+--------------------------------------+

I can SSH into an unrelated instance on the same subnet and, from there, SSH into all three members without any issues. The members are part of a Kubernetes cluster (created using Magnum), and both the subnet and the network were specified during cluster creation.

The issue seems to be with the NodePort that is created by default and that should proxy the LoadBalancer requests to the ClusterIP: I do not get any reply when querying that port with curl, and since the health monitor uses the same port, its checks fail as well.

From what I can see, the issue is not actually with Octavia but with Magnum and the NodePort setup that should let the amphora instance reach the workers' NodePorts.

Thanks in advance
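A few commands that could narrow down where the NodePort path breaks, run on one of the worker nodes (an illustrative sketch; the IPs and port come from the outputs earlier in the thread, and reaching a pod IP directly from a node assumes the cluster's usual overlay/CNI routing is in place):

    # Is anything answering on the NodePort locally, and has kube-proxy
    # programmed rules for it?
    curl -v --connect-timeout 5 http://127.0.0.1:31765/
    sudo iptables-save | grep 31765

    # Bypass the NodePort and hit a pod endpoint directly, to check
    # whether the containers really listen on targetPort 8080 (the
    # stock nginx image listens on port 80 unless its configuration
    # is changed).
    curl -v --connect-timeout 5 http://10.100.4.11:8080/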