<div dir="auto">Hi,<div dir="auto"><br></div><div dir="auto">I already looked at oslo.messaging, but RabbitMQ is the only stable driver :(</div><div dir="auto"><br></div><div dir="auto">Kafka is marked as experimental and (if the docs are correct) is only usable for notifications.</div><div dir="auto"><br></div><div dir="auto">I would love to switch to an alternative. </div><div dir="auto"><br></div><div dir="auto"> Fabian </div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Satish Patel <<a href="mailto:satish.txt@gmail.com">satish.txt@gmail.com</a>> wrote on Sun, 16 Aug 2020, 02:13:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Sean,<br>

<br>

Sounds good, but running RabbitMQ for each service is going to add a little<br>

overhead. Also, how do you scale the cluster? (Yes, we can use cells v2, but it's<br>

not something everyone likes to do because of the complexity.) If we think<br>

RabbitMQ is a growing pain, then why is the community not looking at<br>

alternative options (Kafka, etc.)?<br>

<br>

On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney <<a href="mailto:smooney@redhat.com" target="_blank" rel="noreferrer">smooney@redhat.com</a>> wrote:<br>

><br>

> On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote:<br>

> > Hi,<br>

> ><br>

> > I read somewhere that VEXXHOST's Kubernetes openstack-operator is running<br>

> > one RabbitMQ container per service. Only the Kubernetes self-healing is<br>

> > used as "HA" for RabbitMQ.<br>

> ><br>

> > That seems to match my findings: run RabbitMQ standalone and use an<br>

> > external system to restart RabbitMQ if required.<br>

> That's the design that was originally planned for kolla-kubernetes.<br>

><br>

> Each service was to be deployed with its own RabbitMQ server if it required one,<br>

> and if it crashed it would just be recreated by k8s. It performs better than a cluster,<br>

> and if you trust k8s or the external service enough to ensure it is recreated, it<br>

> should be as effective a solution. You don't even need k8s to do that, but it seems to be<br>

> a good fit if you're prepared to occasionally lose in-flight RPCs.<br>

> If you're not, then you can configure RabbitMQ to persist all messages to disk and mount that on a shared<br>

> file system like NFS or CephFS so that when the RabbitMQ instance is recreated the queue contents are<br>

> preserved. Assuming you can take the performance hit of writing all messages to disk, that is.<br>

> ><br>

> >  Fabian<br>

> ><br>

> > Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank" rel="noreferrer">satish.txt@gmail.com</a>> wrote on Fri, 14 Aug 2020, 16:59:<br>

> ><br>

> > > Fabian,<br>

> > ><br>

> > > what do you mean?<br>

> > ><br>

> > > > > I think vexxhost is running (1) with their openstack-operator - for reasons.<br>

> > ><br>

> > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann <<a href="mailto:dev.faz@gmail.com" target="_blank" rel="noreferrer">dev.faz@gmail.com</a>><br>

> > > wrote:<br>

> > > ><br>

> > > > Hello again,<br>

> > > ><br>

> > > > just a short update about the results of my tests.<br>

> > > ><br>

> > > > I currently see 2 ways of running OpenStack + RabbitMQ:<br>

> > > ><br>

> > > > 1. without durable queues and without replication - just one RabbitMQ process which gets (somehow) restarted if it fails.<br>

> > > > 2. with durable queues and replication<br>

> > > ><br>

> > > > Any other combination of these settings more or less leads to issues with<br>

> > > ><br>

> > > > * broken / non-working bindings<br>

> > > > * broken queues<br>

> > > ><br>

> > > > I think vexxhost is running (1) with their openstack-operator - for reasons.<br>

> > > ><br>

> > > > I added [kolla], because kolla-ansible is installing RabbitMQ with replication but without durable queues.<br>

> > > ><br>

> > > > Could someone point me to the best way to get these findings into some official doc?<br>

> > > > I think a lot of installations out there will run into issues if - under load - a node fails.<br>

> > > ><br>

> > > >  Fabian<br>

> > > ><br>

> > > ><br>

> > > > On Thu, 13 Aug 2020 at 15:13, Fabian Zimmermann <<a href="mailto:dev.faz@gmail.com" target="_blank" rel="noreferrer">dev.faz@gmail.com</a>> wrote:<br>

> > > > ><br>

> > > > > Hi,<br>

> > > > ><br>

> > > > > I just did some short tests today in our test environment (without durable queues and without replication):<br>

> > > > ><br>

> > > > > * started a rally task to generate some load<br>

> > > > > * kill -9'ed RabbitMQ on one node<br>

> > > > > * the rally task immediately stopped and the cloud (mostly) stopped working<br>

> > > > ><br>

> > > > > After some debugging I found (again) exchanges which had bindings to queues, but these bindings didn't forward any msgs.<br>

> > > > > I wrote a small script to detect these broken bindings and will now check if this is "reproducible".<br>

> > > > ><br>

> > > > > Then I will try "durable queues" and "durable queues with replication" to see if this helps, even though I would expect<br>

> > > > > RabbitMQ to be able to handle this without these "hidden broken bindings".<br>

> > > > ><br>

> > > > > This just FYI.<br>

> > > > ><br>

> > > > >  Fabian<br>

><br>

</blockquote></div>
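<div dir="auto"><br></div><div dir="auto">The detection script mentioned in the thread is not shown, so the following is only a minimal sketch of one possible check, not the script that was actually used. It assumes the input records are shaped like the JSON returned by the RabbitMQ management API endpoints <code>/api/bindings</code> and <code>/api/queues</code> (fields <code>source</code>, <code>vhost</code>, <code>destination</code>, <code>destination_type</code>, <code>name</code>); the function name <code>find_dangling_bindings</code> is hypothetical. It flags only one class of problem: bindings whose destination queue no longer exists.</div>

```python
from typing import Dict, Iterable, List


def find_dangling_bindings(bindings: Iterable[Dict], queues: Iterable[Dict]) -> List[Dict]:
    """Return queue bindings whose destination queue is missing from `queues`.

    Both arguments are assumed to be lists of dicts as produced by the
    RabbitMQ management API (/api/bindings and /api/queues).
    """
    # Queues are unique per vhost, so key on (vhost, name).
    existing = {(q["vhost"], q["name"]) for q in queues}

    dangling = []
    for binding in bindings:
        # Exchange-to-exchange bindings are out of scope for this check.
        if binding.get("destination_type") != "queue":
            continue
        # Skip the implicit default-exchange binding every queue has.
        if binding.get("source", "") == "":
            continue
        if (binding["vhost"], binding["destination"]) not in existing:
            dangling.append(binding)
    return dangling


if __name__ == "__main__":
    # Hand-written sample data; real input would come from the management API.
    sample_queues = [{"vhost": "/", "name": "conductor"}]
    sample_bindings = [
        {"source": "nova", "vhost": "/", "destination": "conductor",
         "destination_type": "queue", "routing_key": "conductor"},
        {"source": "nova", "vhost": "/", "destination": "compute.node1",
         "destination_type": "queue", "routing_key": "compute.node1"},
    ]
    for b in find_dangling_bindings(sample_bindings, sample_queues):
        print("dangling binding:", b["source"], "->", b["destination"])
```

<div dir="auto">A binding that exists on both ends but silently drops messages, as described in the tests above, would not be caught by this comparison. Detecting that case would presumably need an active probe, such as publishing a test message and checking whether it arrives.</div>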