<div dir="ltr">Matt, <div></div><div><br></div><div>For a new deployment, how do I enable quorum queues? </div><div><br></div><div>Is adding just the following enough? </div><div><br></div><div>om_enable_rabbitmq_high_availability: True </div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 12, 2023 at 10:54 AM Matt Crees <<a href="mailto:mattc@stackhpc.com">mattc@stackhpc.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Yes, and the option also needs to be under the oslo_messaging_rabbit heading:<br>
<br>
[oslo_messaging_rabbit]<br>
kombu_reconnect_delay=0.5<br>
<br>
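A quick way to sanity-check the placement described above is to parse the override file and confirm the option really sits under the [oslo_messaging_rabbit] heading. This is only a minimal sketch: it uses Python's standard configparser as a stand-in for the parser the OpenStack services actually use, and the helper name reconnect_delay and the sample strings are made up for illustration.

```python
import configparser

SECTION = "oslo_messaging_rabbit"
OPTION = "kombu_reconnect_delay"


def reconnect_delay(conf_text):
    """Return kombu_reconnect_delay as a float if it is set under
    [oslo_messaging_rabbit]; return None otherwise."""
    # Rename the default section so that [DEFAULT] is parsed as an ordinary
    # section and its keys are not inherited by every other section.
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, default_section="__unused__"
    )
    try:
        parser.read_string(conf_text)
    except configparser.MissingSectionHeaderError:
        # The option was written with no section heading at all.
        return None
    if not parser.has_option(SECTION, OPTION):
        return None
    return parser.getfloat(SECTION, OPTION)


# Illustrative inputs (assumptions, not taken from a real deployment).
good = "[oslo_messaging_rabbit]\nkombu_reconnect_delay=0.5\n"
misplaced = "[DEFAULT]\nkombu_reconnect_delay=0.5\n"
no_heading = "kombu_reconnect_delay=0.5\n"

for label, text in [("good", good), ("misplaced", misplaced), ("no heading", no_heading)]:
    print(label, reconnect_delay(text))
```

Only the first sample returns a value; the other two return None because the option is not under the expected heading.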
<br>
On Wed, 12 Apr 2023 at 15:45, Nguyễn Hữu Khôi <<a href="mailto:nguyenhuukhoinw@gmail.com" target="_blank">nguyenhuukhoinw@gmail.com</a>> wrote:<br>
><br>
> Hi.<br>
> Create global.conf in /etc/kolla/config/<br>
><br>
> On Wed, Apr 12, 2023, 9:42 PM Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>> wrote:<br>
>><br>
>> Hi Matt,<br>
>><br>
>> How do I set the kombu_reconnect_delay=0.5 option?<br>
>><br>
>> Something like the following in globals.yml?<br>
>><br>
>> kombu_reconnect_delay: 0.5<br>
>><br>
>> On Wed, Apr 12, 2023 at 4:23 AM Matt Crees <<a href="mailto:mattc@stackhpc.com" target="_blank">mattc@stackhpc.com</a>> wrote:<br>
>>><br>
>>> Hi all,<br>
>>><br>
>>> It seems worth noting here that there is a fix ongoing in<br>
>>> oslo.messaging which will resolve the issues with HA failing when one<br>
>>> node is down. See here:<br>
>>> <a href="https://review.opendev.org/c/openstack/oslo.messaging/+/866617" rel="noreferrer" target="_blank">https://review.opendev.org/c/openstack/oslo.messaging/+/866617</a><br>
>>> In the meantime, we have also found that setting kombu_reconnect_delay<br>
>>> = 0.5 does resolve this issue.<br>
>>><br>
>>> As for why om_enable_rabbitmq_high_availability is currently<br>
>>> defaulting to false, as Michal said enabling it in stable releases<br>
>>> will impact users. This is because it enables durable queues, and the<br>
>>> migration from transient to durable queues is not a seamless<br>
>>> procedure. It requires that the state of RabbitMQ is reset and that<br>
>>> the OpenStack services which use RabbitMQ are restarted to recreate<br>
>>> the queues.<br>
>>><br>
>>> I think that there is some merit in changing this default value. But<br>
>>> if we did this, we should either add additional support to automate<br>
>>> the migration from transient to durable queues, or at the very least<br>
>>> provide some decent docs on the manual procedure.<br>
>>><br>
>>> However, as classic queue mirroring is deprecated in RabbitMQ (to be<br>
>>> removed in RabbitMQ 4.0) we should maybe consider switching to quorum<br>
>>> queues soon. Then it may be beneficial to leave the classic queue<br>
>>> mirroring + durable queues setup as False by default. This is because<br>
>>> the migration between queue types (durable or quorum) can take several<br>
>>> hours on larger deployments. So it might be worth making sure the<br>
>>> default values only require one migration to quorum queues in the<br>
>>> future, rather than two (durable queues now and then quorum queues in<br>
>>> the future).<br>
>>><br>
>>> We will need to make this switch eventually, but right now RabbitMQ<br>
>>> 4.0 does not even have a set release date, so it's not the most urgent<br>
>>> change.<br>
>>><br>
>>> Cheers,<br>
>>> Matt<br>
>>><br>
>>> >Hi Michal,<br>
>>> ><br>
>>> >Feel free to propose a change of the default in the master branch, but I don't think we can change the default in stable branches without impacting users.<br>
>>> ><br>
>>> >Best regards,<br>
>>> >Michal<br>
>>> ><br>
>>> >> On 11 Apr 2023, at 15:18, Michal Arbet <<a href="mailto:michal.arbet@ultimum.io" target="_blank">michal.arbet@ultimum.io</a>> wrote:<br>
>>> >><br>
>>> >> Hi,<br>
>>> >><br>
>>> >> Btw, why do we have this option set to false?<br>
>>> >> Michal Arbet<br>
>>> >> Openstack Engineer<br>
>>> >><br>
>>> >> Ultimum Technologies a.s.<br>
>>> >> Na Poříčí 1047/26, 11000 Praha 1<br>
>>> >> Czech Republic<br>
>>> >><br>
>>> >> +420 604 228 897<br>
>>> >> <a href="mailto:michal.arbet@ultimum.io" target="_blank">michal.arbet@ultimum.io</a><br>
>>> >> <a href="https://ultimum.io" rel="noreferrer" target="_blank">https://ultimum.io</a><br>
>>> >><br>
>>> >> LinkedIn <<a href="https://www.linkedin.com/company/ultimum-technologies" rel="noreferrer" target="_blank">https://www.linkedin.com/company/ultimum-technologies</a>> | Twitter <<a href="https://twitter.com/ultimumtech" rel="noreferrer" target="_blank">https://twitter.com/ultimumtech</a>> | Facebook <<a href="https://www.facebook.com/ultimumtechnologies/timeline" rel="noreferrer" target="_blank">https://www.facebook.com/ultimumtechnologies/timeline</a>><br>
>>> >><br>
>>> >><br>
>>> >> On Tue, 11 Apr 2023 at 14:48, Michał Nasiadka <<a href="mailto:mnasiadka@gmail.com" target="_blank">mnasiadka@gmail.com</a>> wrote:<br>
>>> >>> Hello,<br>
>>> >>><br>
>>> >>> RabbitMQ HA has been backported into stable releases, and it's documented here:<br>
>>> >>> <a href="https://docs.openstack.org/kolla-ansible/yoga/reference/message-queues/rabbitmq.html#high-availability" rel="noreferrer" target="_blank">https://docs.openstack.org/kolla-ansible/yoga/reference/message-queues/rabbitmq.html#high-availability</a><br>
>>> >>><br>
>>> >>> Best regards,<br>
>>> >>> Michal<br>
>>> >>><br>
>>> >>> On Tue, 11 Apr 2023 at 13:32, Nguyễn Hữu Khôi <<a href="mailto:nguyenhuukhoinw@gmail.com" target="_blank">nguyenhuukhoinw@gmail.com</a>> wrote:<br>
>>> >>>> Yes.<br>
>>> >>>> But cluster cannot work properly without it. :(<br>
>>> >>>><br>
>>> >>>> On Tue, Apr 11, 2023, 6:18 PM Danny Webb <<a href="mailto:Danny.Webb@thehutgroup.com" target="_blank">Danny.Webb@thehutgroup.com</a>> wrote:<br>
>>> >>>>> This commit explains why they largely removed HA queue durability:<br>
>>> >>>>><br>
>>> >>>>> <a href="https://opendev.org/openstack/kolla-ansible/commit/2764844ee2ff9393a4eebd90a9a912588af0a180" rel="noreferrer" target="_blank">https://opendev.org/openstack/kolla-ansible/commit/2764844ee2ff9393a4eebd90a9a912588af0a180</a><br>
>>> >>>>> From: Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>><br>
>>> >>>>> Sent: 09 April 2023 04:16<br>
>>> >>>>> To: Nguyễn Hữu Khôi <<a href="mailto:nguyenhuukhoinw@gmail.com" target="_blank">nguyenhuukhoinw@gmail.com</a>><br>
>>> >>>>> Cc: OpenStack Discuss <<a href="mailto:openstack-discuss@lists.openstack.org" target="_blank">openstack-discuss@lists.openstack.org</a>><br>
>>> >>>>> Subject: Re: [openstack][sharing][kolla ansible]Problems when 1 of 3 controller was be down<br>
>>> >>>>><br>
>>> >>>>> Are you proposing a solution or just raising an issue?<br>
>>> >>>>><br>
>>> >>>>> I did find it strange that kolla-ansible doesn't enable HA queues by default. That is a disaster, because when one of the nodes goes down it makes the whole RabbitMQ cluster unusable. Whenever I deploy Kolla I have to add an HA policy to make the queues HA; otherwise you will end up with problems.<br>
>>> >>>>><br>
>>> >>>>> On Sat, Apr 8, 2023 at 6:40 AM Nguyễn Hữu Khôi <<a href="mailto:nguyenhuukhoinw@gmail.com" target="_blank">nguyenhuukhoinw@gmail.com</a>> wrote:<br>
>>> >>>>> Hello everyone.<br>
>>> >>>>><br>
>>> >>>>> I want to summarize the problems you may meet with OpenStack when deploying a cluster with 3 controllers using Kolla Ansible.<br>
>>> >>>>><br>
>>> >>>>> Scenario: 1 of 3 controllers is down<br>
>>> >>>>><br>
>>> >>>>> 1. Logging in to Horizon and using APIs such as Nova and Cinder becomes very slow<br>
>>> >>>>><br>
>>> >>>>> Fix: edit the following templates (for example, with nano):<br>
>>> >>>>><br>
>>> >>>>> kolla-ansible/ansible/roles/heat/templates/heat.conf.j2<br>
>>> >>>>> kolla-ansible/ansible/roles/nova/templates/nova.conf.j2<br>
>>> >>>>> kolla-ansible/ansible/roles/keystone/templates/keystone.conf.j2<br>
>>> >>>>> kolla-ansible/ansible/roles/neutron/templates/neutron.conf.j2<br>
>>> >>>>> kolla-ansible/ansible/roles/cinder/templates/cinder.conf.j2<br>
>>> >>>>><br>
>>> >>>>> or the template of any other service that needs caching,<br>
>>> >>>>><br>
>>> >>>>> and add the following:<br>
>>> >>>>><br>
>>> >>>>> [cache]<br>
>>> >>>>> backend = oslo_cache.memcache_pool<br>
>>> >>>>> enabled = True<br>
>>> >>>>> memcache_servers = {{ kolla_internal_vip_address }}:{{ memcached_port }}<br>
>>> >>>>> memcache_dead_retry = 0.25<br>
>>> >>>>> memcache_socket_timeout = 900<br>
>>> >>>>><br>
>>> >>>>> <a href="https://review.opendev.org/c/openstack/kolla-ansible/+/849487" rel="noreferrer" target="_blank">https://review.opendev.org/c/openstack/kolla-ansible/+/849487</a><br>
>>> >>>>><br>
>>> >>>>> But that is not the end.<br>
>>> >>>>><br>
>>> >>>>> 2. Launching an instance fails or gets stuck at the block device mapping step<br>
>>> >>>>><br>
>>> >>>>> nano kolla-ansible/ansible/roles/rabbitmq/templates/definitions.json.j2<br>
>>> >>>>><br>
>>> >>>>> "policies":[<br>
>>> >>>>> {"vhost": "/", "name": "ha-all", "pattern": "^(?!(amq\\.)|(.*_fanout_)|(reply_)).*", "apply-to": "all", "definition": {"ha-mode":"all"}, "priority":0}{% if project_name == 'outward_rabbitmq' %},<br>
>>> >>>>> {"vhost": "{{ murano_agent_rabbitmq_vhost }}", "name": "ha-all", "pattern": ".*", "apply-to": "all", "definition": {"ha-mode":"all"}, "priority":0}<br>
>>> >>>>> {% endif %}<br>
>>> >>>>> ]<br>
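To see which queues the ha-all policy above would mirror, the regular expression can be tried against a few queue names. This is a rough sketch only: RabbitMQ evaluates the pattern with its own regex engine, Python's re module is used here as a stand-in, and the queue names are hypothetical examples rather than names taken from a real deployment.

```python
import re

# Regular expression from the "ha-all" policy: match every queue except
# names starting with "amq.", names containing "_fanout_", and names
# starting with "reply_".
HA_PATTERN = re.compile(r"^(?!(amq\.)|(.*_fanout_)|(reply_)).*")


def is_mirrored(queue_name):
    """Return True if the policy pattern matches the queue name."""
    return HA_PATTERN.match(queue_name) is not None


# Hypothetical queue names, for illustration only.
samples = [
    "conductor",
    "notifications.info",
    "amq.gen-Xa2",
    "scheduler_fanout_1f3c",
    "reply_9b7e",
]

for name in samples:
    print(f"{name}: {'mirrored' if is_mirrored(name) else 'not mirrored'}")
```

The fanout and reply queues are excluded on purpose, since they are short-lived and tied to a single service instance.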
>>> >>>>><br>
>>> >>>>> nano /etc/kolla/config/global.conf<br>
>>> >>>>><br>
>>> >>>>> [oslo_messaging_rabbit]<br>
>>> >>>>> kombu_reconnect_delay=0.5<br>
>>> >>>>><br>
>>> >>>>><br>
>>> >>>>> <a href="https://bugs.launchpad.net/oslo.messaging/+bug/1993149" rel="noreferrer" target="_blank">https://bugs.launchpad.net/oslo.messaging/+bug/1993149</a><br>
>>> >>>>> <a href="https://docs.openstack.org/large-scale/journey/configure/rabbitmq.html" rel="noreferrer" target="_blank">https://docs.openstack.org/large-scale/journey/configure/rabbitmq.html</a><br>
>>> >>>>><br>
>>> >>>>> I used Xena 13.4 and Yoga 14.8.1.<br>
>>> >>>>><br>
>>> >>>>> The bugs above are critical, but I see that they have not been fixed. I am just an operator, and I want to share what I encountered with people who are new to OpenStack.<br>
>>> >>>>><br>
>>> >>>>><br>
>>> >>>>> Nguyen Huu Khoi<br>
>>> >>> --<br>
>>> >>> Michał Nasiadka<br>
>>> >>> <a href="mailto:mnasiadka@gmail.com" target="_blank">mnasiadka@gmail.com</a><br>
>>><br>
</blockquote></div>