[nova][neutron][oslo][ops][kolla] rabbit bindings issue

Fabian Zimmermann dev.faz at gmail.com
Sun Aug 16 05:40:55 UTC 2020


Hi,

I already looked into oslo.messaging, but RabbitMQ is the only stable driver :(

Kafka is marked as experimental and (if the docs are correct) is only
usable for notifications.
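
To illustrate the "notifications only" point: oslo.messaging reads a separate `transport_url` from `[oslo_messaging_notifications]` and falls back to the RPC one in `[DEFAULT]` if it is not set. A minimal sketch, assuming made-up hosts and credentials (the config text and the helper name `transport_schemes` are hypothetical, not from any deployment), that shows which driver ends up being used for which kind of traffic:

```python
import configparser
from urllib.parse import urlsplit

# Hypothetical example configuration; hosts and credentials are placeholders.
EXAMPLE_CONF = """
[DEFAULT]
transport_url = rabbit://openstack:secret@rabbit-host:5672/

[oslo_messaging_notifications]
transport_url = kafka://kafka-host:9092/
"""


def transport_schemes(conf_text):
    """Return the URL scheme (driver name) used for RPC and for notifications."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(conf_text)
    rpc_url = cfg.get("DEFAULT", "transport_url")
    # If no separate notification transport is set, use the RPC transport,
    # which mirrors the documented fallback in oslo.messaging.
    notif_url = cfg.get(
        "oslo_messaging_notifications", "transport_url", fallback=rpc_url
    )
    return {
        "rpc": urlsplit(rpc_url).scheme,
        "notifications": urlsplit(notif_url).scheme,
    }


if __name__ == "__main__":
    print(transport_schemes(EXAMPLE_CONF))
```

So even with Kafka in the picture, RPC would still go through RabbitMQ in this sketch.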

I would love to switch to an alternative.

 Fabian

On Sun, Aug 16, 2020 at 02:13, Satish Patel <satish.txt at gmail.com> wrote:

> Hi Sean,
>
> Sounds good, but running RabbitMQ for each service is going to add a little
> overhead too. Also, how do you scale the cluster? (Yes, we can use cells v2,
> but that's not something everyone likes to do because of the complexity.) If
> we think RabbitMQ is a growing pain, then why is the community not looking
> for an alternative option (Kafka, etc.)?
>
> On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney <smooney at redhat.com> wrote:
> >
> > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote:
> > > Hi,
> > >
> > > I read somewhere that vexxhost's Kubernetes openstack-operator is running
> > > one RabbitMQ container per service. Only the Kubernetes self-healing is
> > > used as "HA" for RabbitMQ.
> > >
> > > That seems to match my finding: run RabbitMQ standalone and use an
> > > external system to restart RabbitMQ if required.
> > That's the design that was originally planned for kolla-kubernetes.
> >
> > Each service was to be deployed with its own RabbitMQ server if it required
> > one, and if it crashed it would just be recreated by k8s. It performs better
> > than a cluster, and if you trust k8s or the external service enough to ensure
> > it is recreated, it should be as effective a solution. You don't even need
> > k8s to do that, but it seems to be a good fit if you're prepared to
> > occasionally lose in-flight RPCs.
> > If you're not, then you can configure RabbitMQ to persist all messages to
> > disk and mount that on a shared file system like NFS or CephFS, so that when
> > the RabbitMQ instance is recreated the queue contents are preserved. That is,
> > assuming you can take the performance hit of writing all messages to disk.
> > >
> > >  Fabian
> > >
> > > On Fri, Aug 14, 2020 at 16:59, Satish Patel <satish.txt at gmail.com> wrote:
> > >
> > > > Fabian,
> > > >
> > > > What do you mean?
> > > >
> > > > > > I think vexxhost is running (1) with their openstack-operator - for
> > > > > > reasons.
> > > >
> > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann <dev.faz at gmail.com> wrote:
> > > > >
> > > > > Hello again,
> > > > >
> > > > > just a short update about the results of my tests.
> > > > >
> > > > > I currently see 2 ways of running openstack+rabbitmq
> > > > >
> > > > > 1. without durable-queues and without replication - just one
> > > > >    rabbitmq-process which gets (somehow) restarted if it fails.
> > > > > 2. durable-queues and replication
> > > > >
> > > > > Any other combination of these settings leads, more or less, to issues with
> > > > >
> > > > > * broken / non working bindings
> > > > > * broken queues
> > > > >
> > > > > I think vexxhost is running (1) with their openstack-operator - for
> > > > > reasons.
> > > > >
> > > > > I added [kolla], because kolla-ansible is installing rabbitmq with
> > > > > replication but without durable-queues.
> > > > >
> > > > > Could someone point me to the best way to document these findings in
> > > > > some official doc?
> > > > > I think a lot of installations out there will run into issues if - under
> > > > > load - a node fails.
> > > > >
> > > > >  Fabian
> > > > >
> > > > >
> > > > > On Thu, Aug 13, 2020 at 15:13, Fabian Zimmermann <dev.faz at gmail.com> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I just did some short tests today in our test environment (without
> > > > > > durable queues and without replication):
> > > > > >
> > > > > > * started a rally task to generate some load
> > > > > > * kill-9-ed rabbitmq on one node
> > > > > > * rally task immediately stopped and the cloud (mostly) stopped working
> > > > > >
> > > > > > After some debugging I found (again) exchanges which had bindings to
> > > > > > queues, but these bindings didn't forward any messages.
> > > > > > I wrote a small script to detect these broken bindings and will now
> > > > > > check if this is "reproducible".
> > > > > >
> > > > > > Then I will try "durable queues" and "durable queues with replication"
> > > > > > to see if this helps, even though I would expect that RabbitMQ should be
> > > > > > able to handle this without these "hidden broken bindings".
> > > > > >
> > > > > > This just FYI.
> > > > > >
> > > > > >  Fabian
> >
>
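
The detection script mentioned in the quoted thread is not shown, so the following is only a minimal sketch of one possible check, not the original script. It assumes the input is the JSON returned by the RabbitMQ management API (`/api/bindings` and `/api/queues`), and it only covers the simpler case where a binding points at a queue that no longer exists. Bindings that look intact on both ends but silently drop messages, as described in the thread, would need an active probe (publish a test message and check whether it arrives). The function name and the sample data are hypothetical.

```python
def find_dangling_bindings(bindings, queues):
    """Return bindings whose destination queue is missing from the queue list.

    `bindings` and `queues` are lists of dicts shaped like the output of the
    RabbitMQ management API endpoints /api/bindings and /api/queues.
    """
    existing = {(q["vhost"], q["name"]) for q in queues}
    return [
        b
        for b in bindings
        if b.get("source")  # skip implicit default-exchange bindings (source "")
        and b.get("destination_type") == "queue"
        and (b["vhost"], b["destination"]) not in existing
    ]


# Hypothetical sample data, for illustration only.
SAMPLE_QUEUES = [{"vhost": "/", "name": "conductor"}]
SAMPLE_BINDINGS = [
    {"vhost": "/", "source": "", "destination": "conductor",
     "destination_type": "queue", "routing_key": "conductor"},
    {"vhost": "/", "source": "nova", "destination": "conductor",
     "destination_type": "queue", "routing_key": "conductor"},
    {"vhost": "/", "source": "nova", "destination": "reply_abc",
     "destination_type": "queue", "routing_key": "reply_abc"},
]

if __name__ == "__main__":
    for b in find_dangling_bindings(SAMPLE_BINDINGS, SAMPLE_QUEUES):
        print(f"{b['source']} -> {b['destination']} (key {b['routing_key']})")
```

Keeping the check a pure function over already-fetched JSON makes it easy to run against saved API dumps from before and after a node failure.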