[nova][neutron][oslo][ops][kolla] rabbit bindings issue
Ben Nemec
openstack at nemebean.com
Mon Aug 17 16:13:15 UTC 2020
On 8/16/20 3:48 AM, Tobias Urdin wrote:
> Hello,
>
> Kind of off topic, but I've started doing some research to see if a
> KubeMQ driver could be added to oslo.messaging
You may want to take a look at
https://docs.openstack.org/oslo.messaging/latest/contributor/supported-messaging-drivers.html
We've had bad luck with adding new drivers to oslo.messaging in the
past, so we've tried to come up with a policy that gives them the best
possible chance of being successful. It does set a rather high bar for
integration though.
Also take a look at https://review.opendev.org/#/c/692784/. A lot of the
discussion there may be relevant to another new driver.
>
> Best regards
>
>> On 16 Aug 2020, at 07:44, Fabian Zimmermann <dev.faz at gmail.com> wrote:
>>
>>
>> Hi,
>>
>> I already looked at oslo.messaging, but RabbitMQ is the only stable
>> driver :(
>>
>> Kafka is marked as experimental and (if the docs are correct) is only
>> usable for notifications.
>>
>> I would love to switch to an alternative.
>>
>> Fabian
>>
>> Satish Patel <satish.txt at gmail.com <mailto:satish.txt at gmail.com>>
>> wrote on Sun, 16 Aug 2020, 02:13:
>>
>> Hi Sean,
>>
>> Sounds good, but running RabbitMQ for each service is going to add a
>> little overhead. Also, how do you scale the cluster? (Yes, we can use
>> cells v2, but it's not something everyone likes to do because of the
>> complexity.) If we think RabbitMQ is a growing pain, then why is the
>> community not looking for an alternative option (Kafka, etc.)?
>>
>> On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney <smooney at redhat.com
>> <mailto:smooney at redhat.com>> wrote:
>> >
>> > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote:
>> > > Hi,
>> > >
>> > > I read somewhere that vexxhost's Kubernetes openstack-operator is running
>> > > one RabbitMQ container per service. Only the Kubernetes self-healing is
>> > > used as "HA" for RabbitMQ.
>> > >
>> > > That seems to match my finding: run RabbitMQ standalone and use an
>> > > external system to restart RabbitMQ if required.
>> > That's the design that was originally planned for kolla-kubernetes.
>> >
>> > Each service was to be deployed with its own RabbitMQ server if it required one,
>> > and if it crashed it would just be recreated by k8s. It performs better than a cluster,
>> > and if you trust k8s or the external service enough to ensure it is recreated, it
>> > should be as effective a solution. You don't even need k8s to do that, but it seems to be
>> > a good fit if you're prepared to occasionally lose in-flight RPCs.
>> > If you're not, then you can configure RabbitMQ to persist all messages to disk and mount that on a shared
>> > file system like NFS or CephFS, so that when the RabbitMQ instance is recreated the queue contents are
>> > preserved. That assumes you can take the performance hit of writing all messages to disk.
>> > >
>> > > Fabian
>> > >
>> > > Satish Patel <satish.txt at gmail.com <mailto:satish.txt at gmail.com>> wrote on Fri, 14 Aug 2020, 16:59:
>> > >
>> > > > Fabian,
>> > > >
>> > > > what do you mean?
>> > > >
>> > > > > > I think vexxhost is running (1) with their openstack-operator - for
>> > > > > > reasons.
>> > > >
>> > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann <dev.faz at gmail.com <mailto:dev.faz at gmail.com>> wrote:
>> > > > >
>> > > > > Hello again,
>> > > > >
>> > > > > Just a short update about the results of my tests.
>> > > > >
>> > > > > I currently see two ways of running OpenStack + RabbitMQ:
>> > > > >
>> > > > > 1. Without durable queues and without replication: just one
>> > > > >    RabbitMQ process, which gets (somehow) restarted if it fails.
>> > > > > 2. With durable queues and replication.
>> > > > >
>> > > > > Any other combination of these settings leads, more or less, to issues with
>> > > > >
>> > > > > * broken / non-working bindings
>> > > > > * broken queues
>> > > > >
>> > > > > I think vexxhost is running (1) with their openstack-operator - for
>> > > > > reasons.
>> > > > >
>> > > > > I added [kolla] because kolla-ansible installs RabbitMQ with
>> > > > > replication but without durable queues.
>> > > > >
>> > > > > Could someone point me to the best way to document these findings in
>> > > > > some official doc?
>> > > > > I think a lot of installations out there will run into issues if a
>> > > > > node fails under load.
>> > > > >
>> > > > > Fabian
>> > > > >
>> > > > >
>> > > > > On Thu, 13 Aug 2020 at 15:13, Fabian Zimmermann <dev.faz at gmail.com <mailto:dev.faz at gmail.com>> wrote:
>> > > > > >
>> > > > > > Hi,
>> > > > > >
>> > > > > > I just did some short tests today in our test environment (without
>> > > > > > durable queues and without replication):
>> > > > > >
>> > > > > > * started a rally task to generate some load
>> > > > > > * kill -9'ed rabbitmq on one node
>> > > > > > * the rally task immediately stopped and the cloud (mostly) stopped working
>> > > > > >
>> > > > > > After some debugging I found (again) exchanges which had bindings to
>> > > > > > queues, but these bindings didn't forward any messages.
>> > > > > > I wrote a small script to detect these broken bindings and will now check
>> > > > > > if this is "reproducible".
>> > > > > >
>> > > > > > Then I will try "durable queues" and "durable queues with replication"
>> > > > > > to see if this helps, even though I would expect that
>> > > > > > RabbitMQ should be able to handle this without these "hidden broken
>> > > > > > bindings".
>> > > > > >
>> > > > > > This just FYI.
>> > > > > >
>> > > > > > Fabian
>> >
>>
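
For reference, the two workable combinations described in the quoted thread
(no durable queues and no replication, or durable queues with replication)
can be written down as a small check. This is only a sketch under stated
assumptions, not anyone's actual tooling: the function names and messages
are hypothetical, amqp_durable_queues in [oslo_messaging_rabbit] is the
oslo.messaging option read here, and whether queues are replicated is passed
in by hand because mirroring is normally set by a policy on the RabbitMQ
side rather than in the service config.

```python
import configparser

# Sample service config; only amqp_durable_queues is read by this sketch.
SAMPLE_CONF = """
[oslo_messaging_rabbit]
amqp_durable_queues = true
"""


def durable_queues_enabled(conf_text: str) -> bool:
    """Return the amqp_durable_queues setting, defaulting to False."""
    # interpolation=None so '%' characters in real configs do not raise.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.getboolean(
        "oslo_messaging_rabbit", "amqp_durable_queues", fallback=False
    )


def classify_setup(durable_queues: bool, replicated: bool) -> str:
    """Map a (durable, replicated) pair onto the findings from the thread.

    'replicated' is an assumption supplied by the operator; this sketch
    does not query the broker for its mirroring policy.
    """
    if durable_queues and replicated:
        return "ok: durable queues with replication (option 2)"
    if not durable_queues and not replicated:
        return "ok: single broker, restarted externally (option 1)"
    return "risky: mixed settings, broken bindings or queues reported"


if __name__ == "__main__":
    durable = durable_queues_enabled(SAMPLE_CONF)
    print(classify_setup(durable, replicated=True))
```

With these rules, the kolla-ansible default mentioned in the thread
(replication but no durable queues) falls into the "risky" case.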