OpenStack cluster cannot create instances when 1 of 3 RabbitMQ cluster nodes is down

Eugen Block eblock at nde.ag
Mon Oct 24 10:35:57 UTC 2022


You don't need to create a new thread for the same issue.
Do the RabbitMQ logs reveal anything? We create a cluster within
RabbitMQ, and the output looks like this:

---snip---
control01:~ # rabbitmqctl cluster_status
Cluster status of node rabbit@control01 ...
Basics

Cluster name: rabbit@rabbitmq-cluster

Disk Nodes

rabbit@control01
rabbit@control02
rabbit@control03

Running Nodes

rabbit@control01
rabbit@control02
rabbit@control03

Versions

rabbit@control01: RabbitMQ 3.8.3 on Erlang 22.2.7
rabbit@control02: RabbitMQ 3.8.3 on Erlang 22.2.7
rabbit@control03: RabbitMQ 3.8.3 on Erlang 22.2.7
---snip---
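
To compare the "Disk Nodes" and "Running Nodes" sections of such output without reading it by eye, something like the following Python sketch could be used. It is only an illustration: the function name `down_nodes` is made up here, it assumes node names start with the default `rabbit` prefix, and it only looks at the two section headings shown above.

```python
def down_nodes(status_text):
    """Return disk nodes that are not listed under 'Running Nodes'.

    Assumes the plain-text layout of `rabbitmqctl cluster_status` shown
    above and node names that start with "rabbit" (the default prefix).
    """
    sections = {"Disk Nodes": set(), "Running Nodes": set()}
    current = None
    for raw in status_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines separate headings from entries
        if line in sections:
            current = line  # entering a section we care about
        elif current and line.startswith("rabbit"):
            sections[current].add(line)
        else:
            current = None  # any other heading ends the section
    return sorted(sections["Disk Nodes"] - sections["Running Nodes"])


# Sample text modelled on the two-of-three-nodes output quoted in this thread.
sample = """\
Disk Nodes

rabbit@controller01
rabbit@controller02
rabbit@controller03

Running Nodes

rabbit@controller01
rabbit@controller03

Versions

rabbit@controller01: RabbitMQ 3.8.35 on Erlang 23.3.4.18
"""

print(down_nodes(sample))
```

On a live system the text would come from running `rabbitmqctl cluster_status` on one of the controllers; the hard-coded sample is only there to keep the sketch self-contained.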

During failover it's not unexpected that a message gets lost, but it
should be resent, I believe. How is your OpenStack deployed?
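
Since failover depends on the clients being able to reach the remaining brokers, it may be worth checking from an OpenStack node which of the hosts in the transport URL accept TCP connections. The Python sketch below is a rough illustration, not a tested diagnostic: the helper names `broker_endpoints` and `is_reachable` are made up here, the URL is the one from the original report quoted below, and IPv6 literals are not handled.

```python
import socket
from urllib.parse import urlsplit


def broker_endpoints(transport_url, default_port=5672):
    """Split a multi-host transport URL into (host, port) pairs."""
    netloc = urlsplit(transport_url).netloc
    endpoints = []
    for entry in netloc.split(","):
        hostport = entry.rpartition("@")[2]  # drop optional user:pass@
        host, sep, port = hostport.rpartition(":")
        if sep:
            endpoints.append((host, int(port)))
        else:
            endpoints.append((hostport, default_port))
    return endpoints


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(transport_url):
    """Print one line per broker with its TCP reachability."""
    for host, port in broker_endpoints(transport_url):
        state = "reachable" if is_reachable(host, port) else "NOT reachable"
        print(f"{host}:{port} {state}")


# URL as given in the original report; node1..node3 are the reporter's hosts.
REPORTED_URL = "rabbit://node1:5672,node2:5672,node3:5672//"
print(broker_endpoints(REPORTED_URL))
```

Calling `report(REPORTED_URL)` on a controller or compute node while one broker is rebooting would show whether the other two ports are open. A successful TCP connect only proves the listener is up; it says nothing about queue state or whether messages are redelivered.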


Quoting Nguyễn Hữu Khôi <nguyenhuukhoinw at gmail.com>:

> Hello.
> The 2 remaining nodes are still running. Here is my output:
> Basics
>
> Cluster name: rabbit@controller01
>
> Disk Nodes
>
> rabbit@controller01
> rabbit@controller02
> rabbit@controller03
>
> Running Nodes
>
> rabbit@controller01
> rabbit@controller03
>
> Versions
>
> rabbit@controller01: RabbitMQ 3.8.35 on Erlang 23.3.4.18
> rabbit@controller03: RabbitMQ 3.8.35 on Erlang 23.3.4.18
>
> Maintenance status
>
> Node: rabbit@controller01, status: not under maintenance
> Node: rabbit@controller03, status: not under maintenance
>
> Alarms
>
> (none)
>
> Network Partitions
>
> (none)
>
> Listeners
>
> Node: rabbit@controller01, interface: [::], port: 15672, protocol: http, purpose: HTTP API
> Node: rabbit@controller01, interface: 183.81.13.227, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
> Node: rabbit@controller01, interface: 183.81.13.227, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
> Node: rabbit@controller03, interface: [::], port: 15672, protocol: http, purpose: HTTP API
> Node: rabbit@controller03, interface: 183.81.13.229, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
> Node: rabbit@controller03, interface: 183.81.13.229, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
>
> Feature flags
>
> Flag: drop_unroutable_metric, state: enabled
> Flag: empty_basic_get_metric, state: enabled
> Flag: implicit_default_bindings, state: enabled
> Flag: maintenance_mode_status, state: enabled
> Flag: quorum_queue, state: enabled
> Flag: user_limits, state: enabled
> Flag: virtual_host_metadata, state: enabled
>
> I used ha_queues with mode all,
> but that did not help.
> Nguyen Huu Khoi
>
>
> On Tue, Oct 18, 2022 at 7:19 AM Nguyễn Hữu Khôi <nguyenhuukhoinw at gmail.com>
> wrote:
>
>> Description
>> ===========
>> I set up 3 controllers and 3 compute nodes. The system does not work
>> properly when 1 node in the RabbitMQ cluster is down: instances cannot be
>> launched and get stuck at scheduling.
>>
>> Steps to reproduce
>> ===========
>> OpenStack nodes point to rabbit://node1:5672,node2:5672,node3:5672//
>> * Reboot 1 of the 3 RabbitMQ nodes.
>> * Create instances; they get stuck at scheduling.
>>
>> Workaround
>> ===========
>> Point to the RabbitMQ VIP address. However, we cannot share the load with
>> this solution. Please give me some suggestions. Thank you very much.
>> I searched on Google and enabled debug logging in the system logs, but I
>> still cannot understand why.
>>
>> Nguyen Huu Khoi
>>





