You don't need to create a new thread for the same issue. Do the RabbitMQ logs reveal anything? We create a cluster within RabbitMQ, and the output looks like this:
---snip---
control01:~ # rabbitmqctl cluster_status
Cluster status of node rabbit@control01 ...
Basics
Cluster name: rabbit@rabbitmq-cluster
Disk Nodes
rabbit@control01
rabbit@control02
rabbit@control03
Running Nodes
rabbit@control01
rabbit@control02
rabbit@control03
Versions
rabbit@control01: RabbitMQ 3.8.3 on Erlang 22.2.7
rabbit@control02: RabbitMQ 3.8.3 on Erlang 22.2.7
rabbit@control03: RabbitMQ 3.8.3 on Erlang 22.2.7
---snip---
During failover it's not unexpected that a message gets lost, but it should be resent, I believe. How is your OpenStack deployed?
Quoting Nguyễn Hữu Khôi <nguyenhuukhoinw@gmail.com>:
Hello. The 2 remaining nodes are still running. Here is my output:
Basics
Cluster name: rabbit@controller01
Disk Nodes
rabbit@controller01 rabbit@controller02 rabbit@controller03
Running Nodes
rabbit@controller01 rabbit@controller03
Versions
rabbit@controller01: RabbitMQ 3.8.35 on Erlang 23.3.4.18
rabbit@controller03: RabbitMQ 3.8.35 on Erlang 23.3.4.18
Maintenance status
Node: rabbit@controller01, status: not under maintenance
Node: rabbit@controller03, status: not under maintenance
Alarms
(none)
Network Partitions
(none)
Listeners
Node: rabbit@controller01, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@controller01, interface: 183.81.13.227, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@controller01, interface: 183.81.13.227, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@controller03, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@controller03, interface: 183.81.13.229, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@controller03, interface: 183.81.13.229, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Feature flags
Flag: drop_unroutable_metric, state: enabled
Flag: empty_basic_get_metric, state: enabled
Flag: implicit_default_bindings, state: enabled
Flag: maintenance_mode_status, state: enabled
Flag: quorum_queue, state: enabled
Flag: user_limits, state: enabled
Flag: virtual_host_metadata, state: enabled
I used ha_queues with mode all, but it did not help.
Nguyen Huu Khoi
On Tue, Oct 18, 2022 at 7:19 AM Nguyễn Hữu Khôi <nguyenhuukhoinw@gmail.com> wrote:
Description
===========
I set up 3 controllers and 3 compute nodes. My system does not work properly when 1 node in the RabbitMQ cluster is down: I cannot launch instances. They get stuck at scheduling.
Steps to reproduce
===========
* OpenStack nodes point to rabbit://node1:5672,node2:5672,node3:5672//
* Reboot 1 of the 3 RabbitMQ nodes.
* Create instances; they get stuck at scheduling.
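As a rough aid for the reproduction steps above, the following Python sketch splits a multi-host rabbit:// URL into host/port pairs and checks whether each broker accepts TCP connections on its AMQP port. The helper names parse_hosts, reachable and report are hypothetical, IPv6 literals are not handled, and a successful TCP connect only shows that a listener is up, not that the broker is healthy.

```python
import socket


def parse_hosts(transport_url, default_port=5672):
    """Return (host, port) pairs from a URL like rabbit://a:5672,b:5672//.

    Hypothetical helper for illustration; it does not handle IPv6 literals.
    """
    rest = transport_url.split("://", 1)[1]   # drop the scheme
    netloc = rest.split("/", 1)[0]            # drop the virtual host part
    hosts = []
    for entry in netloc.split(","):
        entry = entry.rsplit("@", 1)[-1]      # drop optional user:password@
        host, _, port = entry.partition(":")
        hosts.append((host, int(port) if port else default_port))
    return hosts


def reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(transport_url):
    """Print one line per broker showing whether its port accepts connections."""
    for host, port in parse_hosts(transport_url):
        state = "up" if reachable(host, port) else "DOWN"
        print(f"{host}:{port} {state}")


# Example call (node names taken from the steps above):
# report("rabbit://node1:5672,node2:5672,node3:5672//")
```

Running report() right after rebooting one node should list that node as DOWN while the other two stay up; if clients still hang, the problem is more likely in reconnection or queue failover than in basic reachability.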
Workaround
===========
Point to the RabbitMQ VIP address. But we cannot share the load with this solution.
Please give me some suggestions. Thank you very much. I searched on Google and enabled debug in the system logs, but I still cannot understand why.
Nguyen Huu Khoi