Why does the mariadb container always restart?
Tommy Sway
sz_cuitao at 163.com
Sat Aug 28 06:59:28 UTC 2021
That seems to have taken effect.
But I still cannot access MariaDB through the VIP address. Which container should I restart?
TASK [mariadb : Wait for master mariadb] **********************************************************************************************************************************************
skipping: [control02]
skipping: [control03]
FAILED - RETRYING: Wait for master mariadb (10 retries left).
ok: [control01]
TASK [mariadb : Wait for MariaDB service to be ready through VIP] *********************************************************************************************************************
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (6 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (6 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (6 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (5 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (5 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (5 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (4 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (4 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (4 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (3 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (3 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (3 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (2 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (2 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (2 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (1 retries left).
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (1 retries left).
fatal: [control02]: FAILED! => {"attempts": 6, "changed": false, "cmd": ["docker", "exec", "mariadb", "mysql", "-h", "10.10.10.254", "-P", "3306", "-u", "root", "-pAMDGL9CThcBlIsJZyS6VZKLwqvz0BIGbj5PC00Lf", "-e", "show databases;"], "delta": "0:00:01.409807", "end": "2021-08-28 14:57:14.332713", "msg": "non-zero return code", "rc": 1, "start": "2021-08-28 14:57:12.922906", "stderr": "ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)", "stderr_lines": ["ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)"], "stdout": "", "stdout_lines": []}
FAILED - RETRYING: Wait for MariaDB service to be ready through VIP (1 retries left).
fatal: [control01]: FAILED! => {"attempts": 6, "changed": false, "cmd": ["docker", "exec", "mariadb", "mysql", "-h", "10.10.10.254", "-P", "3306", "-u", "root", "-pAMDGL9CThcBlIsJZyS6VZKLwqvz0BIGbj5PC00Lf", "-e", "show databases;"], "delta": "0:00:03.553486", "end": "2021-08-28 14:57:29.130631", "msg": "non-zero return code", "rc": 1, "start": "2021-08-28 14:57:25.577145", "stderr": "ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)", "stderr_lines": ["ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)"], "stdout": "", "stdout_lines": []}
fatal: [control03]: FAILED! => {"attempts": 6, "changed": false, "cmd": ["docker", "exec", "mariadb", "mysql", "-h", "10.10.10.254", "-P", "3306", "-u", "root", "-pAMDGL9CThcBlIsJZyS6VZKLwqvz0BIGbj5PC00Lf", "-e", "show databases;"], "delta": "0:00:03.549868", "end": "2021-08-28 14:57:30.885324", "msg": "non-zero return code", "rc": 1, "start": "2021-08-28 14:57:27.335456", "stderr": "ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)", "stderr_lines": ["ERROR 2002 (HY000): Can't connect to MySQL server on '10.10.10.254' (115)"], "stdout": "", "stdout_lines": []}
PLAY RECAP ****************************************************************************************************************************************************************************
control01 : ok=30 changed=8 unreachable=0 failed=1 skipped=16 rescued=0 ignored=0
control02 : ok=24 changed=5 unreachable=0 failed=1 skipped=21 rescued=0 ignored=0
control03 : ok=24 changed=5 unreachable=0 failed=1 skipped=21 rescued=0 ignored=0
Command failed ansible-playbook -i ./multinode -e @/etc/kolla/globals.yml -e @/etc/kolla/passwords.yml -e CONFIG_DIR=/etc/kolla -e kolla_action=deploy /venv/share/kolla-ansible/ansible/mariadb_recovery.yml
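ERROR 2002 with errno 115 above means the TCP connection to the VIP timed out; it is not an authentication failure, so MariaDB itself may well be up. A minimal sketch of narrowing this down by hand, assuming the controller addresses 10.10.10.61-63 from the Galera log below, the MariaDB root password from /etc/kolla/passwords.yml, and a default kolla setup where MariaDB listens on the node's API address:

[root@control02 ~]# docker exec mariadb mysql -h 10.10.10.62 -P 3306 -u root -p<database_password> -e "SHOW STATUS LIKE 'wsrep_cluster_%';"

If that reports wsrep_cluster_status = Primary and wsrep_cluster_size = 3 while the same query against 10.10.10.254 still times out, the Galera cluster has recovered and the problem is only with whatever serves the VIP.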
[root@control02 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c2d0521d9833 10.10.10.113:4000/kolla/centos-binary-mariadb-server:wallaby "dumb-init -- kolla_…" 4 minutes ago Up 4 minutes mariadb
7f4038b89518 10.10.10.113:4000/kolla/centos-binary-horizon:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes (healthy) horizon
82f8e756b5da 10.10.10.113:4000/kolla/centos-binary-heat-engine:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes (healthy) heat_engine
124d292e17d9 10.10.10.113:4000/kolla/centos-binary-heat-api-cfn:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes (healthy) heat_api_cfn
75f6bebed35a 10.10.10.113:4000/kolla/centos-binary-heat-api:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes (healthy) heat_api
c8fc3fc49fe0 10.10.10.113:4000/kolla/centos-binary-neutron-server:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes (unhealthy) neutron_server
1ed052094fde 10.10.10.113:4000/kolla/centos-binary-nova-novncproxy:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes (healthy) nova_novncproxy
b25403743c4b 10.10.10.113:4000/kolla/centos-binary-nova-conductor:wallaby "dumb-init --single-…" 2 days ago Up 1 second (health: starting) nova_conductor
f150ff15e53a 10.10.10.113:4000/kolla/centos-binary-nova-api:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes (unhealthy) nova_api
c71b1718c4d8 10.10.10.113:4000/kolla/centos-binary-nova-scheduler:wallaby "dumb-init --single-…" 2 days ago Up 1 second (health: starting) nova_scheduler
8a5d43ac62ca 10.10.10.113:4000/kolla/centos-binary-placement-api:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes (unhealthy) placement_api
f0c142d683bf 10.10.10.113:4000/kolla/centos-binary-keystone:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes (healthy) keystone
6decd0fb670c 10.10.10.113:4000/kolla/centos-binary-keystone-fernet:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes keystone_fernet
b2b8b14c114b 10.10.10.113:4000/kolla/centos-binary-keystone-ssh:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes (healthy) keystone_ssh
f119c52904f9 10.10.10.113:4000/kolla/centos-binary-rabbitmq:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes rabbitmq
e57493e8a877 10.10.10.113:4000/kolla/centos-binary-memcached:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes memcached
bcb4f5a0f4a9 10.10.10.113:4000/kolla/centos-binary-mariadb-clustercheck:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes mariadb_clustercheck
6b7ffe32799c 10.10.10.113:4000/kolla/centos-binary-cron:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes cron
ccd2b7c0d212 10.10.10.113:4000/kolla/centos-binary-kolla-toolbox:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes kolla_toolbox
cf4ec99b9c59 10.10.10.113:4000/kolla/centos-binary-fluentd:wallaby "dumb-init --single-…" 2 days ago Up 10 minutes fluentd
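In a Kolla-Ansible deployment the internal VIP (10.10.10.254 here) is normally held by the keepalived container and MariaDB traffic to it is proxied by haproxy, and neither of those containers shows up in the listing above. A quick sketch of checking this, assuming 10.10.10.254 is kolla_internal_vip_address from globals.yml:

[root@control02 ~]# docker ps -a | egrep 'haproxy|keepalived'
[root@control02 ~]# ip addr | grep -w 10.10.10.254

If the haproxy/keepalived containers are stopped or missing on every controller, redeploying just that role should bring the VIP back, e.g. ``kolla-ansible -i ./multinode deploy --tags loadbalancer`` on Wallaby (older releases used the ``haproxy`` tag).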
-----Original Message-----
From: Radosław Piliszek <radoslaw.piliszek at gmail.com>
Sent: Saturday, August 28, 2021 1:06 AM
To: Tommy Sway <sz_cuitao at 163.com>
Cc: openstack-discuss <openstack-discuss at lists.openstack.org>
Subject: Re: Why does the mariadb container always restart?
On Fri, Aug 27, 2021 at 6:56 PM Tommy Sway <sz_cuitao at 163.com> wrote:
>
> Hi:
>
>
>
> The system went down because of a power failure, and I restarted the whole system, but the mariadb container keeps restarting, about once every minute.
>
>
>
> And this is the log:
>
>
>
>
>
> 2021-08-28 0:34:51 0 [Note] WSREP: (a8e9005b,
> 'tcp://10.10.10.63:4567') turning message relay requesting off
>
> 2021-08-28 0:35:03 0 [ERROR] WSREP: failed to open gcomm backend
> connection: 110: failed to reach primary view: 110 (Connection timed
> out)
>
> at gcomm/src/pc.cpp:connect():160
>
> 2021-08-28 0:35:03 0 [ERROR] WSREP:
> gcs/src/gcs_core.cpp:gcs_core_open():209: Failed to open backend
> connection: -110 (Connection timed out)
>
> 2021-08-28 0:35:03 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1475:
> Failed to open channel 'openstack' at
> 'gcomm://10.10.10.61:4567,10.10.10.62:4567,10.10.10.63:4567': -110
> (Connection timed out)
>
> 2021-08-28 0:35:03 0 [ERROR] WSREP: gcs connect failed: Connection
> timed out
>
> 2021-08-28 0:35:03 0 [ERROR] WSREP:
> wsrep::connect(gcomm://10.10.10.61:4567,10.10.10.62:4567,10.10.10.63:4
> 567) failed: 7
>
> 2021-08-28 0:35:03 0 [ERROR] Aborting
>
>
>
> 210828 00:35:05 mysqld_safe mysqld from pid file
> /var/lib/mysql/mariadb.pid ended
>
> 210828 00:35:08 mysqld_safe Starting mysqld daemon with databases from
> /var/lib/mysql/
>
> 210828 00:35:08 mysqld_safe WSREP: Running position recovery with --disable-log-error --pid-file='/var/lib/mysql//control03-recover.pid'
>
> 210828 00:35:12 mysqld_safe WSREP: Recovered position
> 5bdd8d83-05a4-11ec-adfd-ae6e7e26deb9:140403
>
> 2021-08-28 0:35:12 0 [Note] WSREP: Read nil XID from storage engines,
> skipping position init
>
> 2021-08-28 0:35:12 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
>
> 2021-08-28 0:35:12 0 [Note] /usr/libexec/mysqld (mysqld 10.3.28-MariaDB-log) starting as process 258 ...
>
> 2021-08-28 0:35:12 0 [Note] WSREP: wsrep_load(): Galera 3.32(rXXXX) by Codership Oy info at codership.com loaded successfully.
>
> 2021-08-28 0:35:12 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
>
> 2021-08-28 0:35:12 0 [Note] WSREP: Found saved state:
> 5bdd8d83-05a4-11ec-adfd-ae6e7e26deb9:-1, safe_to_bootstrap: 1
>
> 2021-08-28 0:35:12 0 [Note] WSREP: Passing config to GCS: base_dir =
> /var/lib/mysql/; base_host = 10.10.10.63; base_port = 4567;
> cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no;
> evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period =
> PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout =
> PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3;
> evs.send_window = 4; evs.stats_report_period = PT1M;
> evs.suspect_timeout = PT5S; evs.user_send_window = 2;
> evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/;
> gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name =
> /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover
> = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0;
> gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no;
> gcs.max_packet_size = 64500; gcs.max_throttle = 0.25;
> gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit =
> 0.25; gcs.sync_donor = no; gmcast.listen_addr =
> tcp://10.10.10.63:4567; gmcast.segment = 0; gmc
>
> 2021-08-28 0:35:12 0 [Note] WSREP: GCache history reset:
> 5bdd8d83-05a4-11ec-adfd-ae6e7e26deb9:0 ->
> 5bdd8d83-05a4-11ec-adfd-ae6e7e26deb9:140403
>
> 2021-08-28 0:35:12 0 [Note] WSREP: Assign initial position for
> certification: 140403, protocol version: -1
>
> 2021-08-28 0:35:12 0 [Note] WSREP: wsrep_sst_grab()
>
> 2021-08-28 0:35:12 0 [Note] WSREP: Start replication
>
> 2021-08-28 0:35:12 0 [Note] WSREP: Setting initial position to
> 5bdd8d83-05a4-11ec-adfd-ae6e7e26deb9:140403
>
> 2021-08-28 0:35:12 0 [Note] WSREP: protonet asio version 0
>
> 2021-08-28 0:35:12 0 [Note] WSREP: Using CRC-32C for message checksums.
>
> 2021-08-28 0:35:12 0 [Note] WSREP: backend: asio
>
> 2021-08-28 0:35:12 0 [Note] WSREP: gcomm thread scheduling priority
> set to other:0
>
> 2021-08-28 0:35:12 0 [Warning] WSREP: access
> file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
>
> 2021-08-28 0:35:12 0 [Note] WSREP: restore pc from disk failed
>
> 2021-08-28 0:35:12 0 [Note] WSREP: GMCast version 0
>
> 2021-08-28 0:35:12 0 [Note] WSREP: (c05f5c52,
> 'tcp://10.10.10.63:4567') listening at tcp://10.10.10.63:4567
>
> 2021-08-28 0:35:12 0 [Note] WSREP: (c05f5c52,
> 'tcp://10.10.10.63:4567') multicast: , ttl: 1
>
> 2021-08-28 0:35:12 0 [Note] WSREP: EVS version 0
>
> 2021-08-28 0:35:12 0 [Note] WSREP: gcomm: connecting to group 'openstack', peer '10.10.10.61:4567,10.10.10.62:4567,10.10.10.63:4567'
>
> 2021-08-28 0:35:12 0 [Note] WSREP: (c05f5c52,
> 'tcp://10.10.10.63:4567') connection established to b1ae23e3
> tcp://10.10.10.62:4567
>
> 2021-08-28 0:35:12 0 [Note] WSREP: (c05f5c52, 'tcp://10.10.10.63:4567') turning message relay requesting on, nonlive peers:
>
> 2021-08-28 0:35:16 0 [Note] WSREP: (c05f5c52,
> 'tcp://10.10.10.63:4567') turning message relay requesting off
>
> 2021-08-28 0:35:16 0 [Note] WSREP: (c05f5c52,
> 'tcp://10.10.10.63:4567') connection established to c296912b
> tcp://10.10.10.61:4567
>
> 2021-08-28 0:35:16 0 [Note] WSREP: (c05f5c52, 'tcp://10.10.10.63:4567') turning message relay requesting on, nonlive peers:
>
> 2021-08-28 0:35:18 0 [Note] WSREP: declaring b1ae23e3 at
> tcp://10.10.10.62:4567 stable
>
> 2021-08-28 0:35:18 0 [Note] WSREP: declaring c296912b at
> tcp://10.10.10.61:4567 stable
>
> 2021-08-28 0:35:19 0 [Warning] WSREP: no nodes coming from prim view,
> prim not possible
>
>
>
>
>
> What is wrong with it?
>
>
After lights-out, you have to run ``kolla-ansible mariadb_recovery`` to safely recover the Galera cluster state.
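For reference, a minimal sketch of that recovery plus a follow-up check, assuming the same ./multinode inventory used above and the MariaDB root password from /etc/kolla/passwords.yml:

[root@control01 ~]# kolla-ansible -i ./multinode mariadb_recovery
[root@control01 ~]# docker exec mariadb mysql -u root -p<database_password> -e "SHOW STATUS LIKE 'wsrep_cluster_%';"

After a successful recovery every controller should report wsrep_cluster_status = Primary and, for this three-node setup, wsrep_cluster_size = 3. The recovery playbook decides which node to bootstrap from by recovering the highest Galera seqno on each node (the "Recovered position" lines in the log above); the last saved state can also be inspected in grastate.dat inside the mariadb volume, typically /var/lib/docker/volumes/mariadb/_data/grastate.dat on the host.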
-yoctozepto