Hi Nguyễn,

Oh! That makes sense. In your post you did the following, which is why I got confused :) Let me try it in the /etc/kolla/config/global.conf file and run deploy.
nano /etc/kolla/global.conf
[oslo_messaging_rabbit]
kombu_reconnect_delay=0.5
On Wed, Apr 12, 2023 at 10:45 AM Nguyễn Hữu Khôi <nguyenhuukhoinw@gmail.com> wrote:
Hi. Create global.conf in /etc/kolla/config/
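As a minimal sketch, the file might look like the following. The section name and option are taken from the original post in this thread. That Kolla Ansible merges /etc/kolla/config/global.conf into every service's configuration is an assumption based on its config-override behaviour, so check it against the documentation for your release.

```ini
# /etc/kolla/config/global.conf
# Workaround discussed in this thread; 0.5 is the value reported to help.
[oslo_messaging_rabbit]
kombu_reconnect_delay = 0.5
```

After creating the file, a deploy or reconfigure run (for example `kolla-ansible -i <inventory> reconfigure`, where the inventory path is a placeholder) should push the setting out to the service containers.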
On Wed, Apr 12, 2023, 9:42 PM Satish Patel <satish.txt@gmail.com> wrote:
Hi Matt,
How do I set the kombu_reconnect_delay=0.5 option?
Something like the following in globals.yml?
kombu_reconnect_delay: 0.5
On Wed, Apr 12, 2023 at 4:23 AM Matt Crees <mattc@stackhpc.com> wrote:
Hi all,
It seems worth noting here that there is a fix ongoing in oslo.messaging which will resolve the issues with HA failing when one node is down. See here: https://review.opendev.org/c/openstack/oslo.messaging/+/866617 In the meantime, we have also found that setting kombu_reconnect_delay = 0.5 does resolve this issue.
As for why om_enable_rabbitmq_high_availability is currently defaulting to false, as Michal said enabling it in stable releases will impact users. This is because it enables durable queues, and the migration from transient to durable queues is not a seamless procedure. It requires that the state of RabbitMQ is reset and that the OpenStack services which use RabbitMQ are restarted to recreate the queues.
I think that there is some merit in changing this default value. But if we did this, we should either add additional support to automate the migration from transient to durable queues, or at the very least provide some decent docs on the manual procedure.
However, as classic queue mirroring is deprecated in RabbitMQ (to be removed in RabbitMQ 4.0) we should maybe consider switching to quorum queues soon. Then it may be beneficial to leave the classic queue mirroring + durable queues setup as False by default. This is because the migration between queue types (durable or quorum) can take several hours on larger deployments. So it might be worth making sure the default values only require one migration to quorum queues in the future, rather than two (durable queues now and then quorum queues in the future).
We will need to make this switch eventually, but right now RabbitMQ 4.0 does not even have a set release date, so it's not the most urgent change.
Cheers, Matt
Hi Michal,
Feel free to propose a change of the default in the master branch, but I don't think we can change the default in stable branches without impacting users.
Best regards, Michal
On 11 Apr 2023, at 15:18, Michal Arbet <michal.arbet@ultimum.io> wrote:
Hi,
Btw, why do we have such an option set to false?

Michal Arbet
Openstack Engineer
Ultimum Technologies a.s.
Na Poříčí 1047/26, 11000 Praha 1
Czech Republic
+420 604 228 897
michal.arbet@ultimum.io
https://ultimum.io
On Tue, Apr 11, 2023 at 2:48 PM Michał Nasiadka <mnasiadka@gmail.com> wrote:
Hello,
RabbitMQ HA has been backported into stable releases, and it's documented here:
https://docs.openstack.org/kolla-ansible/yoga/reference/message-queues/rabbi...
Best regards, Michal
On Tue, Apr 11, 2023 at 1:32 PM Nguyễn Hữu Khôi <nguyenhuukhoinw@gmail.com> wrote:
> Yes.
> But the cluster cannot work properly without it. :(
>
> On Tue, Apr 11, 2023, 6:18 PM Danny Webb <Danny.Webb@thehutgroup.com> wrote:
>> This commit explains why they largely removed HA queue durability:
>>
>> https://opendev.org/openstack/kolla-ansible/commit/2764844ee2ff9393a4eebd90a...
>>
>> From: Satish Patel <satish.txt@gmail.com>
>> Sent: 09 April 2023 04:16
>> To: Nguyễn Hữu Khôi <nguyenhuukhoinw@gmail.com>
>> Cc: OpenStack Discuss <openstack-discuss@lists.openstack.org>
>> Subject: Re: [openstack][sharing][kolla ansible]Problems when 1 of 3 controller was be down
>>
>> Are you proposing a solution or just raising an issue?
>>
>> I did find it strange that kolla-ansible doesn't support HA queues by default. That is a disaster, because when one of the nodes goes down it will make the whole RabbitMQ unavailable. Whenever I deploy kolla I have to add an HA policy to make the queues HA, otherwise you will end up with problems.
>>
>> On Sat, Apr 8, 2023 at 6:40 AM Nguyễn Hữu Khôi <nguyenhuukhoinw@gmail.com> wrote:
>> Hello everyone.
>>
>> I want to summarize, for anyone who meets problems with Openstack when deploying a cluster with 3 controllers using Kolla Ansible.
>>
>> Scenario: 1 of 3 controllers is down
>>
>> 1. Logging in to Horizon and using APIs such as nova and cinder will be very slow
>>
>> fix by:
>>
>> nano:
>> kolla-ansible/ansible/roles/heat/templates/heat.conf.j2
>> kolla-ansible/ansible/roles/nova/templates/nova.conf.j2
>> kolla-ansible/ansible/roles/keystone/templates/keystone.conf.j2
>> kolla-ansible/ansible/roles/neutron/templates/neutron.conf.j2
>> kolla-ansible/ansible/roles/cinder/templates/cinder.conf.j2
>>
>> or whichever service needs caches
>>
>> add as below
>>
>> [cache]
>> backend = oslo_cache.memcache_pool
>> enabled = True
>> memcache_servers = {{ kolla_internal_vip_address }}:{{ memcached_port }}
>> memcache_dead_retry = 0.25
>> memcache_socket_timeout = 900
>>
>> https://review.opendev.org/c/openstack/kolla-ansible/+/849487
>>
>> but it is not the end
>>
>> 2. Cannot launch instance or mapping block device (stuck at this step)
>>
>> nano kolla-ansible/ansible/roles/rabbitmq/templates/definitions.json.j2
>>
>> "policies":[
>> {"vhost": "/", "name": "ha-all", "pattern": "^(?!(amq\.)|(.*_fanout_)|(reply_)).*", "apply-to": "all", "definition": {"ha-mode":"all"}, "priority":0}{% if project_name == 'outward_rabbitmq' %},
>> {"vhost": "{{ murano_agent_rabbitmq_vhost }}", "name": "ha-all", "pattern": ".*", "apply-to": "all", "definition": {"ha-mode":"all"}, "priority":0}
>> {% endif %}
>> ]
>>
>> nano /etc/kolla/global.conf
>>
>> [oslo_messaging_rabbit]
>> kombu_reconnect_delay=0.5
>>
>> https://bugs.launchpad.net/oslo.messaging/+bug/1993149
>> https://docs.openstack.org/large-scale/journey/configure/rabbitmq.html
>>
>> I used Xena 13.4 and Yoga 14.8.1.
>>
>> The bugs above are critical, but I see that they have not been fixed. I am just an operator and I want to share what I encountered for new people who come to Openstack.
>>
>> Nguyen Huu Khoi

--
Michał Nasiadka
mnasiadka@gmail.com