[kolla-ansible][mariadb]error (galera cluster problem ?)

Laurent Dumont laurentfdumont at gmail.com
Sat Feb 5 16:08:02 UTC 2022


   - Any chance you can revert both the switches and the servers? If that
   fixes it, that would indicate that MTU was the issue.
   - Don't ping the iSCSI bay; ping between the OpenStack controllers,
   since they are the ones running mariadb/galera.
   - Since default ICMP packets are small, they might not trigger the MTU
   issue. Can you try something like
   "ping -s 8972 -M do -c 4 $mariadb_host_2" from $mariadb_host_1?
   - What is your network setup on the servers? Two ports in a bond? Did
   you change the MTU on both physical interfaces and on the bond interface
   itself?
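
For reference, here is where the 8972 comes from. This is only a rough
sketch: the function names are made up for illustration, it assumes IPv4
with no IP options, and it assumes the Linux iputils ping ("-M do" forbids
fragmentation). The peer address is the second gcomm address from the log
further down. It only prints the commands, it does not run them.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes (IPv4 header, no options) minus 8 bytes (ICMP header).
# (hypothetical helper name, for illustration only)
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Print (do not run) the ping command that tests a given MTU against a peer.
# $1 = MTU to test, $2 = peer address or hostname
mtu_ping_cmd() {
    echo "ping -s $(icmp_payload_for_mtu "$1") -M do -c 4 $2"
}

mtu_ping_cmd 9000 10.0.5.110   # jumbo-frame test
mtu_ping_cmd 1500 10.0.5.110   # standard-MTU test
```

If the 1500 variant gets replies and the 9000 variant does not, something on
the path (switch port, bond, or a physical interface) is likely not passing
jumbo frames.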


4 minutes to create a 20GB empty volume seems too long to me. For an actual
20GB image, it's going to depend on the speed of the backing storage tech.

On Sat, Feb 5, 2022 at 1:51 AM Franck VEDEL <
franck.vedel at univ-grenoble-alpes.fr> wrote:

> Thanks for your help.
>
>
>
>    - What was the starting value for MTU?
>
> 1500
>
>
>    - What was the starting value changed to for MTU?
>
> 9000
>
>
>    - Can ping between all your controllers?
>
> yes, all containers start except nova-conductor, nova-scheduler, mariadb
>
>
>
>    - Do you just have two controllers running mariadb?
>
> yes
>
>
>    - How did you change MTU?
>
>
> On the 3 servers:
>
> nmcli connection modify team0-port1 802-3-ethernet.mtu 9000
> nmcli connection modify team1-port2 802-3-ethernet.mtu 9000
>
> nmcli connection modify type team0 team.runner lacp ethernet.mtu 9000
>
> nmcli con down team0
>
> nmcli con down team1
>
>
>
>
>    - Was the change reverted at the network level as well (switches need
>    to be configured at the same MTU value as the servers, or higher)
>
> I didn’t change the MTU on the network (switches), but ping -s 10.0.5.117
> (iSCSI bay) was working from serv3.
>
> I changed the MTU because I find that creating volumes takes a long time
> (4 minutes for 20G, which is too long for what I want to do; the students'
> patience decreases over the years)
>
> Franck
>
> On 4 Feb 2022 at 23:12, Laurent Dumont <laurentfdumont at gmail.com>
> wrote:
>
>
>    - What was the starting value for MTU?
>    - What was the starting value changed to for MTU?
>    - Can ping between all your controllers?
>    - Do you just have two controllers running mariadb?
>    - How did you change MTU?
>    - Was the change reverted at the network level as well (switches need
>    to be configured at the same MTU value as the servers, or higher)
>
> 4567 seems to be the port for galera (clustering for mariadb)
>
> On Fri, Feb 4, 2022 at 11:52 AM Franck VEDEL <
> franck.vedel at univ-grenoble-alpes.fr> wrote:
>
>> Hello,
>> I am in an emergency, a quite catastrophic situation, because I do not
>> know what to do.
>>
>> I have an OpenStack cluster with 3 servers (serv1, serv2, serv3). It was
>> working so well…
>>
>>
>> A network admin came to me and told me to change the MTU on the network
>> cards. I knew it shouldn't be done... I shouldn't have done it.
>> I did it.
>> Of course, it didn't work as expected. I went back to my starting
>> configuration, and now I have a big problem with mariadb, which is set up
>> on serv1 and serv2.
>>
>> Here are my errors:
>>
>>
>> 2022-02-04 17:40:36 0 [ERROR] WSREP: failed to open gcomm backend
>> connection: 110: failed to reach primary view: 110 (Connection timed out)
>>  at gcomm/src/pc.cpp:connect():160
>> 2022-02-04 17:40:36 0 [ERROR] WSREP:
>> gcs/src/gcs_core.cpp:gcs_core_open():209: Failed to open backend
>> connection: -110 (Connection timed out)
>> 2022-02-04 17:40:36 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1475:
>> Failed to open channel 'openstack' at '
>> gcomm://10.0.5.109:4567,10.0.5.110:4567': -110 (Connection timed out)
>> 2022-02-04 17:40:36 0 [ERROR] WSREP: gcs connect failed: Connection timed
>> out
>> 2022-02-04 17:40:36 0 [ERROR] WSREP: wsrep::connect(
>> gcomm://10.0.5.109:4567,10.0.5.110:4567) failed: 7
>> 2022-02-04 17:40:36 0 [ERROR] Aborting
>>
>>
>>
>>
>> I do not know what to do. My installation was done with kolla-ansible,
>> and the mariadb Docker container restarts every 30 seconds.
>>
>> Can the "kolla-ansible reconfigure mariadb" command be a solution?
>> Could the command "kolla-ansible mariadb recovery" be a solution?
>>
>> Thanks in advance if you can help me.
>>
>>
>>
>> Franck
>>
>>
>>
>