Hello everybody,
after some investigation into the RabbitMQ problems, we found
duplicated messages and timeouts in the logs. Restarting the whole
RabbitMQ cluster (stopping all rabbitmq containers and starting them
one by one) solved the problems for now.
The main cause of this issue seems to be the nova notifications
configuration which was deployed by kolla-ansible. If searchlight is not
installed, 'notifications/notification_format' should be set to
'unversioned'. The default is 'both', so nova also sends notifications
to the versioned_notifications queue, which has no consumer. In our
case that queue accumulated a huge number of messages, which made the
rabbitmq cluster more and more unstable, see:
https://bugzilla.redhat.com/show_bug.cgi?id=1592528
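
To check whether a queue is piling up like this, the output of
"rabbitmqctl -q list_queues name messages consumers" (run inside a
rabbitmq container) can be filtered for queues that hold messages but
have no consumer. A minimal sketch follows; the function names and the
threshold of 1000 are my own assumptions, not something taken from our
setup, and queue names containing whitespace are not handled:

```python
def parse_queue_listing(output):
    """Parse `rabbitmqctl list_queues name messages consumers` output.

    Returns a list of (name, messages, consumers) tuples. Banner and
    header lines that do not look like "<name> <int> <int>" are skipped.
    """
    rows = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        name, messages, consumers = parts
        if not (messages.isdigit() and consumers.isdigit()):
            continue
        rows.append((name, int(messages), int(consumers)))
    return rows


def unconsumed_queues(rows, threshold=1000):
    """Return (name, messages) for queues with no consumer and a backlog.

    `threshold` is an arbitrary illustrative cut-off.
    """
    return [(n, m) for n, m, c in rows if c == 0 and m >= threshold]
```

Any versioned_notifications.* queue showing up in the result would
match what we saw.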
The following setting in nova.conf may solve this issue, but we
haven't tested it yet:
[notifications]
notification_format = unversioned
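
To confirm that the setting actually ended up in the nova.conf a
service is using, the file content can be checked with Python's
configparser. This is only a sketch: the function name is made up, and
I am assuming configparser in non-strict mode is tolerant enough for a
typical nova.conf:

```python
import configparser


def notification_format(conf_text):
    """Return the [notifications]/notification_format value, or None.

    None means the option is not set in this file, so nova falls back
    to its built-in default.
    """
    # strict=False tolerates repeated options; interpolation=None keeps
    # literal '%' characters in values from raising errors.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    return parser.get("notifications", "notification_format", fallback=None)
```

Reading the file with open(path).read() and passing the text in should
return "unversioned" once the change is deployed.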
BR
Pawel