Why the mariadb container always restart ?

Tommy Sway sz_cuitao at 163.com
Sat Aug 28 06:59:28 UTC 2021


It seems to have taken effect.

But I still cannot access MariaDB through the VIP address. Which container should I restart?




TASK [mariadb : Wait for master mariadb] **********************************************************************************************************************************************
skipping: [control02]
skipping: [control03]
FAILED - RETRYING: Wait for master mariadb (10 retries left).
ok: [control01]

TASK [mariadb : Wait for MariaDB service to be ready through VIP] *********************************************************************************************************************
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (6 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (6 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (6 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (5 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (5 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (5 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (4 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (4 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (4 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (3 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (3 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (3 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (2 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (2 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (2 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (1 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (1 retries left).
fatal: [control02]: FAILED! => {"attempts": 6, "changed": false, "cmd": ["docker", "exec", "mariadb", "mysql", "-h", "10.10.10.254", "-P", "3306", "-u", "root", "-pAMDGL9CThcBlIsJZyS6VZKLwqvz0BIGbj5PC00Lf", "-e", "show databases;"], "delta": "0:00:01.409807", "end": "2021-08-28 14:57:14.332713", "msg": "non-zero return code", "rc": 1, "start": "2021-08-28 14:57:12.922906", "stderr": "ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)", "stderr_lines": ["ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)"], "stdout": "", "stdout_lines": []}
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (1 retries left).
fatal: [control01]: FAILED! => {"attempts": 6, "changed": false, "cmd": ["docker", "exec", "mariadb", "mysql", "-h", "10.10.10.254", "-P", "3306", "-u", "root", "-pAMDGL9CThcBlIsJZyS6VZKLwqvz0BIGbj5PC00Lf", "-e", "show databases;"], "delta": "0:00:03.553486", "end": "2021-08-28 14:57:29.130631", "msg": "non-zero return code", "rc": 1, "start": "2021-08-28 14:57:25.577145", "stderr": "ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)", "stderr_lines": ["ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)"], "stdout": "", "stdout_lines": []}
fatal: [control03]: FAILED! => {"attempts": 6, "changed": false, "cmd": ["docker", "exec", "mariadb", "mysql", "-h", "10.10.10.254", "-P", "3306", "-u", "root", "-pAMDGL9CThcBlIsJZyS6VZKLwqvz0BIGbj5PC00Lf", "-e", "show databases;"], "delta": "0:00:03.549868", "end": "2021-08-28 14:57:30.885324", "msg": "non-zero return code", "rc": 1, "start": "2021-08-28 14:57:27.335456", "stderr": "ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)", "stderr_lines": ["ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)"], "stdout": "", "stdout_lines": []}

PLAY RECAP ****************************************************************************************************************************************************************************
control01                  : ok=30   changed=8    unreachable=0    failed=1    skipped=16   rescued=0    ignored=0
control02                  : ok=24   changed=5    unreachable=0    failed=1    skipped=21   rescued=0    ignored=0
control03                  : ok=24   changed=5    unreachable=0    failed=1    skipped=21   rescued=0    ignored=0

Command failed ansible-playbook -i ./multinode -e @/etc/kolla/globals.yml  -e @/etc/kolla/passwords.yml -e CONFIG_DIR=/etc/kolla  -e kolla_action=deploy /venv/share/kolla-ansible/ansible/mariadb_recovery.yml
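
A minimal sketch, assuming Python 3 is available on a control node, for checking from outside the container whether anything accepts TCP connections on port 3306 at the VIP (10.10.10.254) compared with the individual nodes. The function name can_connect is made up for illustration. The node addresses in the comment are the ones that appear in the Galera log quoted further down; whether MariaDB listens on those addresses in this deployment is an assumption.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing answers on that address and port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example use (addresses taken from the output in this thread):
#   for host in ("10.10.10.254", "10.10.10.61", "10.10.10.62", "10.10.10.63"):
#       print(host, "open" if can_connect(host, 3306) else "closed")
```

If the node addresses answer but the VIP does not, the problem is more likely in whatever provides the VIP than in MariaDB itself. This only tests TCP reachability, not whether the database accepts logins.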




[root@control02 ~]# docker ps
CONTAINER ID   IMAGE                                                                COMMAND                  CREATED         STATUS                           PORTS     NAMES
c2d0521d9833   10.10.10.113:4000/kolla/centos-binary-mariadb-server:wallaby         "dumb-init -- kolla_…"   4 minutes ago   Up 4 minutes                               mariadb
7f4038b89518   10.10.10.113:4000/kolla/centos-binary-horizon:wallaby                "dumb-init --single-…"   2 days ago      Up 10 minutes (healthy)                    horizon
82f8e756b5da   10.10.10.113:4000/kolla/centos-binary-heat-engine:wallaby            "dumb-init --single-…"   2 days ago      Up 10 minutes (healthy)                    heat_engine
124d292e17d9   10.10.10.113:4000/kolla/centos-binary-heat-api-cfn:wallaby           "dumb-init --single-…"   2 days ago      Up 10 minutes (healthy)                    heat_api_cfn
75f6bebed35a   10.10.10.113:4000/kolla/centos-binary-heat-api:wallaby               "dumb-init --single-…"   2 days ago      Up 10 minutes (healthy)                    heat_api
c8fc3fc49fe0   10.10.10.113:4000/kolla/centos-binary-neutron-server:wallaby         "dumb-init --single-…"   2 days ago      Up 10 minutes (unhealthy)                  neutron_server
1ed052094fde   10.10.10.113:4000/kolla/centos-binary-nova-novncproxy:wallaby        "dumb-init --single-…"   2 days ago      Up 10 minutes (healthy)                    nova_novncproxy
b25403743c4b   10.10.10.113:4000/kolla/centos-binary-nova-conductor:wallaby         "dumb-init --single-…"   2 days ago      Up 1 second (health: starting)             nova_conductor
f150ff15e53a   10.10.10.113:4000/kolla/centos-binary-nova-api:wallaby               "dumb-init --single-…"   2 days ago      Up 10 minutes (unhealthy)                  nova_api
c71b1718c4d8   10.10.10.113:4000/kolla/centos-binary-nova-scheduler:wallaby         "dumb-init --single-…"   2 days ago      Up 1 second (health: starting)             nova_scheduler
8a5d43ac62ca   10.10.10.113:4000/kolla/centos-binary-placement-api:wallaby          "dumb-init --single-…"   2 days ago      Up 10 minutes (unhealthy)                  placement_api
f0c142d683bf   10.10.10.113:4000/kolla/centos-binary-keystone:wallaby               "dumb-init --single-…"   2 days ago      Up 10 minutes (healthy)                    keystone
6decd0fb670c   10.10.10.113:4000/kolla/centos-binary-keystone-fernet:wallaby        "dumb-init --single-…"   2 days ago      Up 10 minutes                              keystone_fernet
b2b8b14c114b   10.10.10.113:4000/kolla/centos-binary-keystone-ssh:wallaby           "dumb-init --single-…"   2 days ago      Up 10 minutes (healthy)                    keystone_ssh
f119c52904f9   10.10.10.113:4000/kolla/centos-binary-rabbitmq:wallaby               "dumb-init --single-…"   2 days ago      Up 10 minutes                              rabbitmq
e57493e8a877   10.10.10.113:4000/kolla/centos-binary-memcached:wallaby              "dumb-init --single-…"   2 days ago      Up 10 minutes                              memcached
bcb4f5a0f4a9   10.10.10.113:4000/kolla/centos-binary-mariadb-clustercheck:wallaby   "dumb-init --single-…"   2 days ago      Up 10 minutes                              mariadb_clustercheck
6b7ffe32799c   10.10.10.113:4000/kolla/centos-binary-cron:wallaby                   "dumb-init --single-…"   2 days ago      Up 10 minutes                              cron
ccd2b7c0d212   10.10.10.113:4000/kolla/centos-binary-kolla-toolbox:wallaby          "dumb-init --single-…"   2 days ago      Up 10 minutes                              kolla_toolbox
cf4ec99b9c59   10.10.10.113:4000/kolla/centos-binary-fluentd:wallaby                "dumb-init --single-…"   2 days ago      Up 10 minutes                              fluentd






-----Original Message-----
From: Radosław Piliszek <radoslaw.piliszek at gmail.com> 
Sent: Saturday, August 28, 2021 1:06 AM
To: Tommy Sway <sz_cuitao at 163.com>
Cc: openstack-discuss <openstack-discuss at lists.openstack.org>
Subject: Re: Why the mariadb container always restart ?

On Fri, Aug 27, 2021 at 6:56 PM Tommy Sway <sz_cuitao at 163.com> wrote:
>
> Hi:
>
>
>
> The system went down because of a power failure. I restarted the whole system, but the mariadb container keeps restarting, about once a minute.
>
>
>
> And this is the log:
>
>
>
>
>
> 2021-08-28  0:34:51 0 [Note] WSREP: (a8e9005b, 
> 'tcp://10.10.10.63:4567') turning message relay requesting off
>
> 2021-08-28  0:35:03 0 [ERROR] WSREP: failed to open gcomm backend 
> connection: 110: failed to reach primary view: 110 (Connection timed 
> out)
>
>          at gcomm/src/pc.cpp:connect():160
>
> 2021-08-28  0:35:03 0 [ERROR] WSREP: 
> gcs/src/gcs_core.cpp:gcs_core_open():209: Failed to open backend 
> connection: -110 (Connection timed out)
>
> 2021-08-28  0:35:03 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1475: 
> Failed to open channel 'openstack' at 
> 'gcomm://10.10.10.61:4567,10.10.10.62:4567,10.10.10.63:4567': -110 
> (Connection timed out)
>
> 2021-08-28  0:35:03 0 [ERROR] WSREP: gcs connect failed: Connection 
> timed out
>
> 2021-08-28  0:35:03 0 [ERROR] WSREP: wsrep::connect(gcomm://10.10.10.61:4567,10.10.10.62:4567,10.10.10.63:4567) failed: 7
>
> 2021-08-28  0:35:03 0 [ERROR] Aborting
>
>
>
> 210828 00:35:05 mysqld_safe mysqld from pid file 
> /var/lib/mysql/mariadb.pid ended
>
> 210828 00:35:08 mysqld_safe Starting mysqld daemon with databases from 
> /var/lib/mysql/
>
> 210828 00:35:08 mysqld_safe WSREP: Running position recovery with --disable-log-error  --pid-file='/var/lib/mysql//control03-recover.pid'
>
> 210828 00:35:12 mysqld_safe WSREP: Recovered position 
> 5bdd8d83-05a4-11ec-adfd-ae6e7e26deb9:140403
>
> 2021-08-28  0:35:12 0 [Note] WSREP: Read nil XID from storage engines, 
> skipping position init
>
> 2021-08-28  0:35:12 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
>
> 2021-08-28  0:35:12 0 [Note] /usr/libexec/mysqld (mysqld 10.3.28-MariaDB-log) starting as process 258 ...
>
> 2021-08-28  0:35:12 0 [Note] WSREP: wsrep_load(): Galera 3.32(rXXXX) by Codership Oy info at codership.com loaded successfully.
>
> 2021-08-28  0:35:12 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
>
> 2021-08-28  0:35:12 0 [Note] WSREP: Found saved state: 
> 5bdd8d83-05a4-11ec-adfd-ae6e7e26deb9:-1, safe_to_bootstrap: 1
>
> 2021-08-28  0:35:12 0 [Note] WSREP: Passing config to GCS: base_dir = 
> /var/lib/mysql/; base_host = 10.10.10.63; base_port = 4567; 
> cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; 
> evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = 
> PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = 
> PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; 
> evs.send_window = 4; evs.stats_report_period = PT1M; 
> evs.suspect_timeout = PT5S; evs.user_send_window = 2; 
> evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; 
> gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = 
> /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover 
> = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; 
> gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; 
> gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; 
> gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 
> 0.25; gcs.sync_donor = no; gmcast.listen_addr = 
> tcp://10.10.10.63:4567; gmcast.segment = 0; gmc
>
> 2021-08-28  0:35:12 0 [Note] WSREP: GCache history reset: 
> 5bdd8d83-05a4-11ec-adfd-ae6e7e26deb9:0 -> 
> 5bdd8d83-05a4-11ec-adfd-ae6e7e26deb9:140403
>
> 2021-08-28  0:35:12 0 [Note] WSREP: Assign initial position for 
> certification: 140403, protocol version: -1
>
> 2021-08-28  0:35:12 0 [Note] WSREP: wsrep_sst_grab()
>
> 2021-08-28  0:35:12 0 [Note] WSREP: Start replication
>
> 2021-08-28  0:35:12 0 [Note] WSREP: Setting initial position to 
> 5bdd8d83-05a4-11ec-adfd-ae6e7e26deb9:140403
>
> 2021-08-28  0:35:12 0 [Note] WSREP: protonet asio version 0
>
> 2021-08-28  0:35:12 0 [Note] WSREP: Using CRC-32C for message checksums.
>
> 2021-08-28  0:35:12 0 [Note] WSREP: backend: asio
>
> 2021-08-28  0:35:12 0 [Note] WSREP: gcomm thread scheduling priority 
> set to other:0
>
> 2021-08-28  0:35:12 0 [Warning] WSREP: access 
> file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
>
> 2021-08-28  0:35:12 0 [Note] WSREP: restore pc from disk failed
>
> 2021-08-28  0:35:12 0 [Note] WSREP: GMCast version 0
>
> 2021-08-28  0:35:12 0 [Note] WSREP: (c05f5c52, 
> 'tcp://10.10.10.63:4567') listening at tcp://10.10.10.63:4567
>
> 2021-08-28  0:35:12 0 [Note] WSREP: (c05f5c52, 
> 'tcp://10.10.10.63:4567') multicast: , ttl: 1
>
> 2021-08-28  0:35:12 0 [Note] WSREP: EVS version 0
>
> 2021-08-28  0:35:12 0 [Note] WSREP: gcomm: connecting to group 'openstack', peer '10.10.10.61:4567,10.10.10.62:4567,10.10.10.63:4567'
>
> 2021-08-28  0:35:12 0 [Note] WSREP: (c05f5c52, 
> 'tcp://10.10.10.63:4567') connection established to b1ae23e3 
> tcp://10.10.10.62:4567
>
> 2021-08-28  0:35:12 0 [Note] WSREP: (c05f5c52, 'tcp://10.10.10.63:4567') turning message relay requesting on, nonlive peers:
>
> 2021-08-28  0:35:16 0 [Note] WSREP: (c05f5c52, 
> 'tcp://10.10.10.63:4567') turning message relay requesting off
>
> 2021-08-28  0:35:16 0 [Note] WSREP: (c05f5c52, 
> 'tcp://10.10.10.63:4567') connection established to c296912b 
> tcp://10.10.10.61:4567
>
> 2021-08-28  0:35:16 0 [Note] WSREP: (c05f5c52, 'tcp://10.10.10.63:4567') turning message relay requesting on, nonlive peers:
>
> 2021-08-28  0:35:18 0 [Note] WSREP: declaring b1ae23e3 at 
> tcp://10.10.10.62:4567 stable
>
> 2021-08-28  0:35:18 0 [Note] WSREP: declaring c296912b at 
> tcp://10.10.10.61:4567 stable
>
> 2021-08-28  0:35:19 0 [Warning] WSREP: no nodes coming from prim view, 
> prim not possible
>
>
>
>
>
> What is wrong with it?
>
>

After lights-out, you have to run ``kolla-ansible mariadb_recovery`` to safely recover the Galera cluster state.
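
For illustration only: the general idea when recovering a Galera cluster after a full outage is to find the node with the most advanced recovered position and bootstrap the cluster from it. Below is a rough Python sketch of that comparison, assuming each node's log contains a "Recovered position" line like the one in the quoted log. The function names are hypothetical, and this is not the kolla-ansible implementation.

```python
import re

# Matches e.g. "WSREP: Recovered position 5bdd8d83-...-ae6e7e26deb9:140403"
RECOVERED = re.compile(r"WSREP: Recovered position\s+([0-9a-f-]+):(-?\d+)")


def recovered_position(log_text):
    """Return (uuid, seqno) from the last 'Recovered position' line, or None."""
    matches = RECOVERED.findall(log_text)
    if not matches:
        return None
    uuid, seqno = matches[-1]
    return uuid, int(seqno)


def most_advanced_node(logs):
    """Given {node_name: log_text}, return the node with the highest seqno.

    Nodes without a recovered position are ignored; returns None if no
    node has one.
    """
    positions = {}
    for node, text in logs.items():
        pos = recovered_position(text)
        if pos is not None:
            positions[node] = pos
    if not positions:
        return None
    return max(positions, key=lambda node: positions[node][1])
```

The sketch only compares sequence numbers; it does not check that all nodes report the same cluster UUID, which a real recovery procedure should also consider.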

-yoctozepto




