On Sat, Nov 12, 2022 at 3:08 AM Franck VEDEL <franck.vedel@univ-grenoble-alpes.fr> wrote:
Hello!
Output of the command:
Cluster status of node rabbit@iut1r-srv-ops01-i01 ...

Basics

Cluster name: rabbit@iut1r-srv-ops01-i01.u-ga.fr

Disk Nodes

rabbit@iut1r-srv-ops01-i01
rabbit@iut1r-srv-ops02-i01

Running Nodes

rabbit@iut1r-srv-ops01-i01
rabbit@iut1r-srv-ops02-i01

Versions

rabbit@iut1r-srv-ops01-i01: RabbitMQ 3.9.20 on Erlang 24.3.4.2
rabbit@iut1r-srv-ops02-i01: RabbitMQ 3.9.20 on Erlang 24.3.4.2

Maintenance status

Node: rabbit@iut1r-srv-ops01-i01, status: not under maintenance
Node: rabbit@iut1r-srv-ops02-i01, status: not under maintenance

Alarms

(none)

Network Partitions

(none)

Listeners

Node: rabbit@iut1r-srv-ops01-i01, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@iut1r-srv-ops01-i01, interface: [::], port: 15692, protocol: http/prometheus, purpose: Prometheus exporter API over HTTP
Node: rabbit@iut1r-srv-ops01-i01, interface: 10.0.5.109, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@iut1r-srv-ops01-i01, interface: 10.0.5.109, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@iut1r-srv-ops02-i01, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@iut1r-srv-ops02-i01, interface: [::], port: 15692, protocol: http/prometheus, purpose: Prometheus exporter API over HTTP
Node: rabbit@iut1r-srv-ops02-i01, interface: 10.0.5.110, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@iut1r-srv-ops02-i01, interface: 10.0.5.110, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0

Feature flags

Flag: drop_unroutable_metric, state: enabled
Flag: empty_basic_get_metric, state: enabled
Flag: implicit_default_bindings, state: enabled
Flag: maintenance_mode_status, state: enabled
Flag: quorum_queue, state: enabled
Flag: stream_queue, state: enabled
Flag: user_limits, state: enabled
Flag: virtual_host_metadata, state: enabled
So… nothing looks strange to me.
All containers are healthy now (after deleting and rebuilding the rabbitmq container).
Beyond DHCP, network communications do not work at all. If I create an instance, it gets no IP address via DHCP. If I give it a static IP, it can't reach the router. If I create another instance with another static IP, the two don't communicate with each other, and neither can ping the router (or routers: I put one on each of my two external networks).
There are some errors in rabbitmq…..log:

2022-11-12 08:53:37.155542+01:00 [error] <0.16179.2> missed heartbeats from client, timeout: 60s
2022-11-12 08:54:54.026480+01:00 [error] <0.17357.2> closing AMQP connection <0.17357.2> (10.0.5.109:37532 -> 10.0.5.109:5672 - mod_wsgi:43:e50d8e69-7c76-4198-877c-c807e0a180d8):
2022-11-12 08:54:54.026480+01:00 [error] <0.17357.2> missed heartbeats from client, timeout: 60s
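The 60 s in those messages is oslo.messaging's default heartbeat_timeout_threshold under [oslo_messaging_rabbit]. To rule out an unusual client-side setting, something like this should show whether it was overridden (the /etc/kolla/<service>/ path is kolla's usual config layout on the controller, an assumption here):

    # empty output means the oslo.messaging default (60 s) is in effect
    grep -r heartbeat /etc/kolla/neutron-server/neutron.conf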
There are some errors also in neutron-l3-agent.log:

2022-11-11 22:04:42.512 37 ERROR oslo_service.periodic_task     message = self.waiters.get(msg_id, timeout=timeout)
2022-11-11 22:04:42.512 37 ERROR oslo_service.periodic_task   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 445, in get
2022-11-11 22:04:42.512 37 ERROR oslo_service.periodic_task     'to message ID %s' % msg_id)
2022-11-11 22:04:42.512 37 ERROR oslo_service.periodic_task oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 297cacfadd764562bf09a1c5daf61958
Also in neutron-dhcp-agent.log:

2022-11-11 22:04:44.854 7 ERROR neutron.agent.dhcp.agent     message = self.waiters.get(msg_id, timeout=timeout)
2022-11-11 22:04:44.854 7 ERROR neutron.agent.dhcp.agent   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 445, in get
2022-11-11 22:04:44.854 7 ERROR neutron.agent.dhcp.agent     'to message ID %s' % msg_id)
2022-11-11 22:04:44.854 7 ERROR neutron.agent.dhcp.agent oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 6f1d9d0c51ac4d89b9c889ca273f40a0
A lot of errors in neutron-metadata.log:

2022-11-11 22:01:44.152 43 ERROR oslo.messaging._drivers.impl_rabbit [-] [d7902e2c-eba9-40e4-b872-40e7ba7a39ec] AMQP server on 10.0.5.109:5672 is unreachable: <RecoverableConnectionError: unknown error>. Trying again in 1 seconds.: amqp.exceptions.RecoverableConnectionError: <RecoverableConnectionError: unknown error>
2022-11-11 22:01:44.226 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [028872e4-fcd1-4de5-b20c-8c5541e3c77f] AMQP server on 10.0.5.109:5672 is unreachable: <RecoverableConnectionError: unknown error>. Trying again in 1 seconds.: amqp.exceptions.RecoverableConnectionError: <RecoverableConnectionError: unknown error>
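Since everything points at the broker, watching Rabbit's side with the same docker exec pattern used earlier in the thread can show whether replies are piling up or connections are being dropped. Both commands are standard rabbitmqctl:

    # queues with their message backlog and consumer counts
    docker exec -it rabbitmq rabbitmqctl list_queues name messages consumers
    # connections Rabbit currently sees, and their state
    docker exec -it rabbitmq rabbitmqctl list_connections peer_host peer_port state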
timeout… waiting… unreachable… connection error…
Something is wrong, but I think it's very difficult to find the problem. Too difficult for me. "nc -v" works.
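For completeness, that reachability check would be along these lines, against each node's AMQP listener from the status output above:

    nc -zv 10.0.5.109 5672    # -z: probe without sending data, -v: report the result
    nc -zv 10.0.5.110 5672

A successful connect only proves TCP reachability; it says nothing about heartbeats surviving once a channel is busy.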
There are several things that can cause issues with Rabbit, or with services sending messages. Rabbit itself is not always to blame. Things I've seen cause issues before include (quick checks for the first three are sketched after this list):

1) Time not being in sync on all systems (covered that earlier)
2) DNS (it's always DNS, right?)
3) Networking issues like mismatched MTU
4) Nova being configured for a Ceph backend, but timing out trying to talk to the cluster (messages would expire while Nova waited on it)
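Sketches of quick checks for 1)–3), assuming a typical Linux host with chrony (the hostnames and IPs come from your cluster output above; substitute your own):

    # 1) time sync: every node should report a healthy, agreeing clock
    chronyc tracking                  # or: timedatectl status
    # 2) DNS: each node should resolve the others consistently
    getent hosts iut1r-srv-ops01-i01 iut1r-srv-ops02-i01
    # 3) MTU: a do-not-fragment ping at the full expected payload size
    ping -M do -s 1472 10.0.5.110     # 1472 + 28 header bytes = 1500 MTU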
I do not know what to do.
I can lose all data (networks, instances, volumes, etc.). I can start again on a new config. Do I do it with kolla-ansible -i multinode destroy?
Yeah, just do kolla-ansible -i multinode destroy after backing up your kolla configs.
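A minimal sketch, assuming the standard /etc/kolla config location and the multinode inventory file sitting in the current directory (both are assumptions). Note that destroy refuses to run without its explicit confirmation flag:

    # back up the deployment configuration and inventory first
    tar czf ~/kolla-config-backup.tar.gz /etc/kolla multinode
    # tear everything down: containers, named volumes, and their data
    kolla-ansible -i multinode destroy --yes-i-really-really-mean-it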
Before switching to Yoga, I had a cluster under Xena. I kept my configuration and a Python venv with kolla-ansible for Xena. Should I go back to that version? How, without doing anything stupid?
I can't see any good reason to roll back to Xena. Yoga should be fine.
Changing should be as simple as swapping your venv and using your Xena globals.yml, passwords.yml, inventory, and any other custom configs you had for that version.
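Roughly, the swap would look like this (the venv and backup paths are hypothetical placeholders for wherever you kept the Xena artifacts):

    # activate the kolla-ansible venv matching the target release
    source ~/venvs/kolla-xena/bin/activate
    # restore the matching configs before running any kolla-ansible command
    cp xena-backup/globals.yml xena-backup/passwords.yml /etc/kolla/
    kolla-ansible -i xena-backup/multinode deploy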
Thanks a lot.
Franck VEDEL
On Nov 11, 2022, at 23:33, Laurent Dumont <laurentfdumont@gmail.com> wrote:
docker exec -it rabbitmq rabbitmqctl cluster_status