[openstack-ansible] Check current state of Keystone DB: HAProxy galera-clustercheck fails if external_vip is used

Philipp Wörner philipp.woerner at dhbw-mannheim.de
Fri Feb 12 12:31:13 UTC 2021


Dear all,

 

Unfortunately, we are facing a problem while setting up OpenStack (described
at the end of this mail).

Last year we had no issue with our configuration.

 

I found the reason why the playbook stops, and a temporary workaround.

However, I don’t know the root cause; maybe you can help me with some advice.

 

Thank you in advance!

 

Have a sunny weekend and best regards,

Philipp

 

 

 

Where setup-openstack.yml stops:

 

TASK [os_keystone : Check current state of Keystone DB] ********************

fatal: [infra1_keystone_container-01c233df]: FAILED! => {"changed": true,
"cmd": ["/openstack/venvs/keystone-21.2.3.dev4/bin/keystone-manage",
"db_sync", "--check"], "delta": "0:01:42.166790", "end": "2021-02-11
13:01:06.282388", "failed_when_result": true, "msg": "non-zero return code",
"rc": 1, "start": "2021-02-11 12:59:24.115598", "stderr": "",
"stderr_lines": [], "stdout": "", "stdout_lines": []}

 

 

Reason:

This happens because HAProxy considers the service unavailable, as a result
of the failing health check against port 9200.

 

If I test the cluster-check service manually, I see that connections from
the external_vip aren’t allowed.

 

root@bc1bl11:/home/ubuntu# telnet -b <internal_vip> 192.168.110.235 9200

Trying 192.168.110.235...

Connected to 192.168.110.235.

Escape character is '^]'.

HTTP/1.1 200 OK

Content-Type: text/plain

Connection: close

Content-Length: 40

 

Percona XtraDB Cluster Node is synced.

Connection closed by foreign host.

 

root@bc1bl11:/home/ubuntu# telnet -b <external_vip> 192.168.110.235 9200

Trying 192.168.110.235...

Connected to 192.168.110.235.

Escape character is '^]'.

Connection closed by foreign host.
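
The two telnet tests above could also be scripted. The following is a minimal sketch, not part of OSA: the function name clustercheck_status and its parameters are made up for illustration. It binds to a given source address, sends the same request as the "option httpchk" line in our haproxy.cfg, and returns the status line, or None if the peer closes the connection without answering.

```python
import socket


def clustercheck_status(host, port=9200, source_ip=None, timeout=5.0):
    """Return the HTTP status line from the check service, or None if the
    peer closes (or resets) the connection without sending a response."""
    # Port 0 lets the kernel choose the local port; only the address is pinned.
    source = (source_ip, 0) if source_ip else None
    chunks = []
    with socket.create_connection((host, port), timeout=timeout,
                                  source_address=source) as sock:
        try:
            # Same request as the "option httpchk" line in haproxy.cfg.
            sock.sendall(b"HEAD / HTTP/1.0\r\n"
                         b"User-agent: osa-haproxy-healthcheck\r\n\r\n")
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except ConnectionError:
            # A rejected client may see a reset instead of a clean close.
            pass
    response = b"".join(chunks)
    if not response:
        return None
    # First line of the response, e.g. "HTTP/1.1 200 OK".
    return response.split(b"\r\n", 1)[0].decode("ascii", "replace")
```

Called with host "192.168.110.235" and source_ip set to the internal_vip, this should correspond to the first telnet session; with source_ip set to the external_vip, to the second one.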

 

Workaround:

After manually modifying the service configuration and adding the
external_vip to the whitelist, everything works and the OSA playbook succeeds
as well:

 

root@infra1-galera-container-492e1206:/# cat /etc/xinetd.d/mysqlchk

# default: on

# description: mysqlchk

# Ansible managed

service mysqlchk

{

        disable = no

        flags = REUSE

        socket_type = stream

        port = 9200

        wait = no

        user = nobody

        server = /usr/local/bin/clustercheck

        log_on_failure += USERID

        only_from = 192.168.110.200 192.168.110.235 192.168.110.211 127.0.0.1

        per_source = UNLIMITED

}
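
To check quickly whether a given source address would pass the only_from list above, a small helper like the following could be used. It is a simplified sketch: the function name allowed_by_only_from is hypothetical, and it only understands plain IP addresses and CIDR networks, not hostnames or the other address forms xinetd accepts. The address 192.0.2.10 is just a placeholder for an unlisted source.

```python
import ipaddress


def allowed_by_only_from(only_from, source_ip):
    """Return True if source_ip matches an entry of an only_from value.

    Simplification: entries must be plain IP addresses or CIDR networks.
    """
    source = ipaddress.ip_address(source_ip)
    for entry in only_from.split():
        # A bare address becomes a single-host network (/32 or /128).
        network = ipaddress.ip_network(entry, strict=False)
        if source in network:
            return True
    return False


# Value of only_from taken from the mysqlchk file shown above.
ONLY_FROM = "192.168.110.200 192.168.110.235 192.168.110.211 127.0.0.1"

print(allowed_by_only_from(ONLY_FROM, "192.168.110.211"))  # True
print(allowed_by_only_from(ONLY_FROM, "192.0.2.10"))       # False
```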

 

Question:

I am now wondering why HAProxy uses the external_vip to check the
MySQL service, and why I am facing this problem now, because last year
everything was fine with our configuration.

We just moved the external_vip from the NIC to the bridge in the
netplan-config and the external_vip is now in the same network as the
internal_vip.
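
To see which source address the HAProxy host picks by default for connections to the Galera container, something like the following could be run on that host. This is only a sketch under assumptions: the function name source_address_for is made up, and it shows the kernel's default choice for an unbound socket, which I assume (but have not verified) is also what HAProxy's health check ends up using. With both VIPs on the same bridge and in the same network, the result may be either of them.

```python
import socket


def source_address_for(dest_ip, port=9200):
    """Return the local address the kernel would use to reach dest_ip."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # connect() on a UDP socket sends no packet; it only makes the
        # kernel select a route and a source address for this destination.
        sock.connect((dest_ip, port))
        return sock.getsockname()[0]
```

For example, source_address_for("192.168.110.235") on the HAProxy host would show whether the internal_vip, the external_vip, or another local address is selected.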

 

 

 

Here is also a snippet of our HAProxy config:

 

root@bc1bl11:/home/ubuntu# cat /etc/haproxy/haproxy.cfg

# Ansible managed

 

global

        log /dev/log local0

        chroot /var/lib/haproxy

        user haproxy

        group haproxy

        daemon

        maxconn 4096

        stats socket /var/run/haproxy.stat level admin mode 600

        ssl-default-bind-options force-tlsv12

        tune.ssl.default-dh-param 2048

        

defaults

        log global

        option dontlognull

        option redispatch

        option forceclose

        retries 3

        timeout client 50s

        timeout connect 10s

        timeout http-request 5s

        timeout server 50s

        maxconn 4096

 




 

frontend galera-front-1

    bind 192.168.110.211:3306 

    option tcplog

    timeout client 5000s

    acl white_list src 127.0.0.1/8 192.168.0.0/16 172.16.0.0/12 10.0.0.0/8

    tcp-request content accept if white_list

    tcp-request content reject

    mode tcp

    default_backend galera-back

 

 

backend galera-back

    mode tcp

    balance leastconn

    timeout server 5000s

    stick store-request src

    stick-table type ip size 256k expire 30m

    option tcplog

    option httpchk HEAD / HTTP/1.0\r\nUser-agent:\ osa-haproxy-healthcheck

 

    # server infra1_galera_container-492e1206 192.168.110.235:3306

    server infra1_galera_container-492e1206 192.168.110.235:3306 check port 9200 inter 12000 rise 1 fall 1

 

 
