[nova][neutron][oslo][ops][kolla] rabbit bindings issue

Ben Nemec openstack at nemebean.com
Mon Aug 17 16:13:15 UTC 2020



On 8/16/20 3:48 AM, Tobias Urdin wrote:
> Hello,
> 
> Kind of off topic, but I’ve started doing some research to see if a 
> KubeMQ driver could be added to oslo.messaging

You may want to take a look at 
https://docs.openstack.org/oslo.messaging/latest/contributor/supported-messaging-drivers.html 


We've had bad luck with adding new drivers to oslo.messaging in the 
past, so we've tried to come up with a policy that gives them the best 
possible chance of being successful. It does set a rather high bar for 
integration though.

Also take a look at https://review.opendev.org/#/c/692784/ as a lot of 
the discussion there may be relevant to another new driver.

> 
> Best regards
> 
>> On 16 Aug 2020, at 07:44, Fabian Zimmermann <dev.faz at gmail.com> wrote:
>>
>> 
>> Hi,
>>
>> I already looked at oslo.messaging, but rabbitmq is the only stable 
>> driver :(
>>
>> Kafka is marked as experimental and (if the docs are correct) is only 
>> usable for notifications.
>>
>> Would love to switch to an alternative.
>>
>>  Fabian
>>
>> Satish Patel <satish.txt at gmail.com <mailto:satish.txt at gmail.com>> 
>> wrote on Sun, 16 Aug 2020, 02:13:
>>
>>     Hi Sean,
>>
>>     Sounds good, but running rabbitmq for each service is going to add a
>>     little overhead. Also, how do you scale the cluster? (Yes, we can use
>>     cells v2, but not everyone likes to do that because of the complexity.)
>>     If we think rabbitMQ is a growing pain, then why is the community not
>>     looking at alternative options (kafka, etc.)?
>>
>>     On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney <smooney at redhat.com
>>     <mailto:smooney at redhat.com>> wrote:
>>     >
>>     > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote:
>>     > > Hi,
>>     > >
>>     > > I read somewhere that vexxhost's kubernetes openstack-operator is
>>     > > running one rabbitmq container per service. Just the kubernetes
>>     > > self-healing is used as "ha" for rabbitmq.
>>     > >
>>     > > That seems to match my finding: run rabbitmq standalone and use an
>>     > > external system to restart rabbitmq if required.
>>     > That's the design that was originally planned for kolla-kubernetes.
>>     >
>>     > Each service was to be deployed with its own rabbitmq server if it
>>     > required one, and if it crashed it would just be recreated by k8s. It
>>     > performs better than a cluster, and if you trust k8s or the external
>>     > service enough to ensure it is recreated, it should be as effective a
>>     > solution. You don't even need k8s to do that, but it seems to be a
>>     > good fit if you're prepared to occasionally lose in-flight RPCs.
>>     > If you're not, then you can configure rabbit to persist all messages
>>     > to disk and mount that on a shared file system like nfs or cephfs so
>>     > that when the rabbit instance is recreated the queue contents are
>>     > preserved. Assuming you can take the performance hit of writing all
>>     > messages to disk, that is.
>>     > >
>>     > >  Fabian
>>     > >
>>     > > Satish Patel <satish.txt at gmail.com <mailto:satish.txt at gmail.com>>
>>     > > wrote on Fri, 14 Aug 2020, 16:59:
>>     > >
>>     > > > Fabian,
>>     > > >
>>     > > > what do you mean?
>>     > > >
>>     > > > > > I think vexxhost is running (1) with their
>>     openstack-operator - for
>>     > > >
>>     > > > reasons.
>>     > > >
>>     > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann
>>     <dev.faz at gmail.com <mailto:dev.faz at gmail.com>>
>>     > > > wrote:
>>     > > > >
>>     > > > > Hello again,
>>     > > > >
>>     > > > > just a short update about the results of my tests.
>>     > > > >
>>     > > > > I currently see 2 ways of running openstack+rabbitmq
>>     > > > >
>>     > > > > 1. without durable-queues and without replication - just one
>>     > > > >    rabbitmq-process which gets (somehow) restarted if it fails.
>>     > > > > 2. durable-queues and replication
>>     > > > >
>>     > > > > Any other combination of these settings leads to more or less
>>     > > > > issues with
>>     > > > >
>>     > > > > * broken / non working bindings
>>     > > > > * broken queues
>>     > > > >
>>     > > > > I think vexxhost is running (1) with their openstack-operator -
>>     > > > > for reasons.
>>     > > > >
>>     > > > > I added [kolla], because kolla-ansible is installing rabbitmq
>>     > > > > with replication but without durable-queues.
>>     > > > >
>>     > > > > Could someone point me to the best way to document these
>>     > > > > findings in some official doc?
>>     > > > > I think a lot of installations out there will run into issues
>>     > > > > if - under load - a node fails.
>>     > > > >
>>     > > > >  Fabian
>>     > > > >
>>     > > > >
>>     > > > > On Thu, 13 Aug 2020 at 15:13, Fabian Zimmermann
>>     > > > > <dev.faz at gmail.com <mailto:dev.faz at gmail.com>> wrote:
>>     > > > > >
>>     > > > > > Hi,
>>     > > > > >
>>     > > > > > just did some short tests today in our test-environment
>>     > > > > > (without durable queues and without replication):
>>     > > > > >
>>     > > > > > * started a rally task to generate some load
>>     > > > > > * kill -9'ed rabbitmq on one node
>>     > > > > > * rally task immediately stopped and the cloud (mostly)
>>     > > > > >   stopped working
>>     > > > > >
>>     > > > > > After some debugging I found (again) exchanges which had
>>     > > > > > bindings to queues, but these bindings didn't forward any
>>     > > > > > msgs. I wrote a small script to detect these broken bindings
>>     > > > > > and will now check if this is "reproducible".
>>     > > > > >
>>     > > > > > Then I will try "durable queues" and "durable queues with
>>     > > > > > replication" to see if this helps, even though I would expect
>>     > > > > > that rabbitmq should be able to handle this without these
>>     > > > > > "hidden broken bindings".
>>     > > > > >
>>     > > > > > This just FYI.
>>     > > > > >
>>     > > > > >  Fabian
>>     >
>>
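
For reference, the two combinations Fabian reports as working (neither
durable queues nor replication, or both together) can be written down as a
small check. This is only a sketch: the function name and the returned
messages are made up for illustration, and it encodes the test results
quoted in this thread, not any documented RabbitMQ or oslo.messaging rule.

```python
def check_rabbit_setup(durable_queues: bool, replication: bool) -> str:
    """Classify a deployment according to the test results quoted above.

    Hypothetical helper, not part of oslo.messaging or kolla-ansible.
    """
    if durable_queues and replication:
        return "ok: durable queues with replication"
    if not durable_queues and not replication:
        return "ok: standalone rabbitmq, restarted externally on failure"
    # Mixed settings are the ones reported to produce broken bindings/queues.
    return "risky: mixed settings, broken bindings or queues reported"


# kolla-ansible as described in the thread: replication, no durable queues.
print(check_rabbit_setup(durable_queues=False, replication=True))
```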

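The script Fabian mentions is not included in the thread. As a rough
illustration of one related check, the sketch below takes data in the shape
returned by the RabbitMQ management plugin's /api/bindings and /api/queues
endpoints and lists bindings whose destination queue no longer exists. The
function name and the sample data are assumptions for illustration. Note that
this does not catch the case described above, where the binding and queue
both exist but messages are silently not forwarded; that would likely need an
active probe, such as publishing a test message.

```python
from typing import Iterable, List


def dangling_bindings(bindings: Iterable[dict], queues: Iterable[dict]) -> List[dict]:
    """Return queue bindings whose destination queue is not in `queues`.

    Hypothetical helper; both arguments are plain lists of dicts using the
    keys found in the management API's JSON (vhost, name, destination, ...).
    """
    existing = {(q["vhost"], q["name"]) for q in queues}
    return [
        b for b in bindings
        if b.get("destination_type") == "queue"
        and (b["vhost"], b["destination"]) not in existing
    ]


if __name__ == "__main__":
    # Hand-written sample data, not output from a real broker.
    queues = [{"vhost": "/", "name": "conductor"}]
    bindings = [
        {"vhost": "/", "source": "nova", "destination": "conductor",
         "destination_type": "queue", "routing_key": "conductor"},
        {"vhost": "/", "source": "nova", "destination": "compute.node1",
         "destination_type": "queue", "routing_key": "compute.node1"},
    ]
    for b in dangling_bindings(bindings, queues):
        print(f"{b['source']} -> {b['destination']} ({b['routing_key']})")
```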

