[openstack][sharing][kolla ansible]Problems when 1 of 3 controller was be down

Michał Nasiadka mnasiadka at gmail.com
Tue Apr 11 12:42:04 UTC 2023


Hello,

RabbitMQ HA has been backported into stable releases, and it’s documented
here:
https://docs.openstack.org/kolla-ansible/yoga/reference/message-queues/rabbitmq.html#high-availability

Best regards,
Michal

On Tue, 11.04.2023 at 13:32, Nguyễn Hữu Khôi <nguyenhuukhoinw at gmail.com>
wrote:

> Yes.
> But the cluster cannot work properly without it. :(
>
> On Tue, Apr 11, 2023, 6:18 PM Danny Webb <Danny.Webb at thehutgroup.com>
> wrote:
>
>> This commit explains why they largely removed HA queue durability:
>>
>>
>> https://opendev.org/openstack/kolla-ansible/commit/2764844ee2ff9393a4eebd90a9a912588af0a180
>> ------------------------------
>> *From:* Satish Patel <satish.txt at gmail.com>
>> *Sent:* 09 April 2023 04:16
>> *To:* Nguyễn Hữu Khôi <nguyenhuukhoinw at gmail.com>
>> *Cc:* OpenStack Discuss <openstack-discuss at lists.openstack.org>
>> *Subject:* Re: [openstack][sharing][kolla ansible]Problems when 1 of 3
>> controller was be down
>>
>> ------------------------------
>> Are you proposing a solution or just raising an issue?
>>
>> I did find it strange that kolla-ansible doesn't enable HA queues by
>> default. That is a disaster, because when one of the nodes goes down it
>> makes the whole RabbitMQ cluster unusable. Whenever I deploy kolla I have
>> to add an HA policy to make the queues HA; otherwise you will end up with
>> problems.
>>
>> On Sat, Apr 8, 2023 at 6:40 AM Nguyễn Hữu Khôi <nguyenhuukhoinw at gmail.com>
>> wrote:
>>
>> Hello everyone.
>>
>> I want to summarize this for anyone who runs into problems with OpenStack
>> when deploying a cluster with 3 controllers using Kolla Ansible.
>>
>> Scenario: 1 of 3 controllers is down
>>
>> 1. Logging in to Horizon and using APIs such as nova and cinder becomes
>> very slow
>>
>> Fix: edit the following templates (for example with nano):
>>
>> kolla-ansible/ansible/roles/heat/templates/heat.conf.j2
>> kolla-ansible/ansible/roles/nova/templates/nova.conf.j2
>> kolla-ansible/ansible/roles/keystone/templates/keystone.conf.j2
>> kolla-ansible/ansible/roles/neutron/templates/neutron.conf.j2
>> kolla-ansible/ansible/roles/cinder/templates/cinder.conf.j2
>>
>> or the template of whichever other service needs caching.
>>
>> Add the following:
>>
>> [cache]
>> backend = oslo_cache.memcache_pool
>> enabled = True
>> memcache_servers = {{ kolla_internal_vip_address }}:{{ memcached_port }}
>> memcache_dead_retry = 0.25
>> memcache_socket_timeout = 900
>>
>> https://review.opendev.org/c/openstack/kolla-ansible/+/849487
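
One way to confirm that the [cache] options quoted above actually reached a
rendered service config is a small check like the one below. This is only a
sketch under stated assumptions: the rendered text, the address 192.0.2.10
and the port 11211 are placeholders rather than values from this thread, and
the helper name missing_cache_options is made up for illustration.

```python
import configparser

# Hypothetical rendered config. The memcached address and port are
# placeholders (assumptions); the option values are copied from the thread.
RENDERED = """
[cache]
backend = oslo_cache.memcache_pool
enabled = True
memcache_servers = 192.0.2.10:11211
memcache_dead_retry = 0.25
memcache_socket_timeout = 900
"""

# Options listed in the quoted [cache] section.
EXPECTED_KEYS = (
    "backend",
    "enabled",
    "memcache_servers",
    "memcache_dead_retry",
    "memcache_socket_timeout",
)


def missing_cache_options(text):
    """Return the expected [cache] options that are absent from `text`."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("cache"):
        return list(EXPECTED_KEYS)
    return [key for key in EXPECTED_KEYS if not parser.has_option("cache", key)]


if __name__ == "__main__":
    # An empty list means every option from the thread is present.
    print(missing_cache_options(RENDERED))
```

The check only looks at whether the options are present. It does not validate
their values or contact memcached.
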
>>
>> But that is not the end of it.
>>
>> 2. Cannot launch an instance (it gets stuck at the block device mapping
>> step)
>>
>> nano kolla-ansible/ansible/roles/rabbitmq/templates/definitions.json.j2
>>
>> "policies":[
>>     {"vhost": "/", "name": "ha-all", "pattern": "^(?!(amq\\.)|(.*_fanout_)|(reply_)).*", "apply-to": "all", "definition": {"ha-mode":"all"}, "priority":0}{% if project_name == 'outward_rabbitmq' %},
>>     {"vhost": "{{ murano_agent_rabbitmq_vhost }}", "name": "ha-all", "pattern": ".*", "apply-to": "all", "definition": {"ha-mode":"all"}, "priority":0}
>>     {% endif %}
>>   ]
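
To see which queues the "ha-all" pattern selects, the regular expression can
be tried against a few names. This is a sketch only: the queue names below
are invented examples, and it assumes Python's re module treats this pattern
the same way as the regex engine RabbitMQ uses for policy matching.

```python
import re

# The regular expression from the "ha-all" policy: it rejects names that
# start with "amq.", contain "_fanout_", or start with "reply_".
POLICY_PATTERN = re.compile(r"^(?!(amq\.)|(.*_fanout_)|(reply_)).*")


def is_mirrored(queue_name):
    """Return True if the policy pattern matches the queue name."""
    return POLICY_PATTERN.match(queue_name) is not None


if __name__ == "__main__":
    # Hypothetical queue names, chosen only to exercise each alternative.
    for name in ("conductor", "compute.node1", "amq.gen-XYZ",
                 "scheduler_fanout_9b1c", "reply_3f2a"):
        print(name, is_mirrored(name))
```

Under that assumption, ordinary service queues are mirrored, while the
transient reply and fanout queues and the broker's own amq.* queues are left
out of the policy.
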
>>
>> nano /etc/kolla/global.conf
>>
>> [oslo_messaging_rabbit]
>> kombu_reconnect_delay=0.5
>>
>>
>> https://bugs.launchpad.net/oslo.messaging/+bug/1993149
>> https://docs.openstack.org/large-scale/journey/configure/rabbitmq.html
>>
>> I used Xena 13.4 and Yoga 14.8.1.
>>
>> The bugs above are critical, but I see that they have not been fixed. I
>> am just an operator, and I want to share what I encountered for new
>> people who come to OpenStack.
>>
>>
>> Nguyen Huu Khoi
>>
>> --
Michał Nasiadka
mnasiadka at gmail.com