[ironic] [oslo] ironic overloading notifications for internal messaging

Harald Jensås hjensas at redhat.com
Tue Feb 5 19:08:35 UTC 2019


On Tue, 2019-02-05 at 11:43 -0500, Ken Giusti wrote:
> On 2/4/19, Harald Jensås <hjensas at redhat.com> wrote:
> > 
> > I opened an oslo.messaging bug[1] yesterday. When notifications are
> > used and all consumers subscribe with one or more pools, the
> > messages are never consumed. The ironic-neutron-agent uses pools
> > for all listeners in its hash-ring member manager, and the result
> > is that notifications are published to the
> > 'ironic-neutron-agent-heartbeat.info' queue and are never consumed.
> > 
> 
> This is an issue with the design of the notification pool feature.
> 
> The Notification service is designed so notification events can be
> sent even though there may currently be no consumers.  It supports
> the ability for events to be queued until one or more consumers are
> ready to process them.  So when a notifier issues an event and there
> are no consumers subscribed, a queue must be provisioned to hold
> that event until consumers appear.
> 
> For notification pools the pool identifier is supplied by the
> notification listener when it subscribes.  The value of any pool id
> is not known beforehand by the notifier, which is important because
> pool ids can be dynamically created by the listeners.  And in many
> cases pool ids are not even used.
> 
> So notifications are always published to a non-pooled queue.  If
> there are pooled subscriptions we rely on the broker to do the
> fanout.  This means that the application should always have at
> least one non-pooled listener for the topic, since any events that
> may be published _before_ the listeners are established will be
> stored on a non-pooled queue.
> 

From what I observe, any message published _before_ or _after_ pool
listeners are established is stored on the non-pooled queue.
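To make the first fix concrete, here is a rough sketch of what I have
in mind for the extra non-pooled listener next to the pooled one. The
endpoint class and the pool name are illustrative only, not the
actual agent code:

    import oslo_messaging
    from oslo_config import cfg

    class HeartbeatEndpoint(object):
        # Hypothetical endpoint; the real agent updates its hash-ring
        # members when a heartbeat notification arrives.
        def info(self, ctxt, publisher_id, event_type, payload, metadata):
            pass

    transport = oslo_messaging.get_notification_transport(cfg.CONF)
    targets = [oslo_messaging.Target(topic='ironic-neutron-agent-heartbeat')]

    # Pooled listener: each agent subscribes with its own pool id and
    # the broker fans a copy of every notification out to each pool.
    pooled = oslo_messaging.get_notification_listener(
        transport, targets, [HeartbeatEndpoint()], pool='agent-<uuid>')

    # Non-pooled listener: at least one of these is needed to drain the
    # default 'ironic-neutron-agent-heartbeat.info' queue, otherwise
    # messages pile up there unconsumed.
    unpooled = oslo_messaging.get_notification_listener(
        transport, targets, [HeartbeatEndpoint()])

    pooled.start()
    unpooled.start()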

> The documentation doesn't make that clear AFAICT - that needs to be
> fixed.
> 

I agree with your conclusion here. This is not clear in the
documentation, and it should be updated to reflect the requirement
for at least one non-pooled listener to consume the non-pooled queue.


> > The second issue: each instance of the agent uses its own pool to
> > ensure all agents are notified about the existence of peer agents.
> > The pools use a UUID that is generated at startup (and re-generated
> > on restart, stop/start etc.). In the case where
> > `[oslo_messaging_rabbit]/amqp_auto_delete = false` in the neutron
> > config, these UUID queues are not automatically removed. So after a
> > restart of the ironic-neutron-agent the queue with the old UUID is
> > left in the message broker with no consumers, growing ...
> > 
> > 
> > I intend to push patches to fix both issues. As a workaround (or
> > the permanent solution) I will create another listener consuming
> > the notifications without a pool. This should fix the first issue.
> > 
> > The second change will set amqp_auto_delete for these specific
> > queues to 'true' regardless. What I'm currently stuck on here is
> > that I need to change the control_exchange for the transport.
> > According to the oslo.messaging documentation it should be possible
> > to override the control_exchange in the transport_url[3]. The idea
> > is to set amqp_auto_delete and an ironic-neutron-agent-specific
> > exchange on the url when setting up the transport for
> > notifications, but so far I believe the doc string on the
> > control_exchange option is wrong.
> > 
> 
> Yes, the doc string is wrong - you can override the default
> control_exchange via the Target's exchange field:
> 
> https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo_messaging/target.py#n40
> 
> At least that's the intent...
> 
> ... however the Notifier API does not take a Target; it takes a list
> of topic _strings_:
> 
> https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo_messaging/notify/notifier.py#n239
> 
> Which seems wrong, especially since the notification Listener
> subscribes to a list of Targets:
> 
> https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo_messaging/notify/listener.py#n227
> 
> I've opened a bug for this and will provide a patch for review
> shortly:
> 
> https://bugs.launchpad.net/oslo.messaging/+bug/1814797
> 
> 

Thanks, this makes sense.
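
For my own understanding, the notification listener already subscribes
with Target objects, so on that side the exchange can be set today.
Something roughly like this is what I picture for a dedicated exchange
(the exchange name is just an example, and I'm reusing the transport
and endpoint from the sketch above):

    # Listener side: Target carries an 'exchange' field, so the default
    # control_exchange can be overridden per subscription.
    target = oslo_messaging.Target(
        exchange='ironic-neutron-agent',      # example exchange name
        topic='ironic-neutron-agent-heartbeat')
    listener = oslo_messaging.get_notification_listener(
        transport, [target], [HeartbeatEndpoint()])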


One question: in Target I can see that there is a 'fanout' parameter.

https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo_messaging/target.py#n62

""" Clients may request that a copy of the message be delivered to all
servers listening on a topic by setting fanout to ``True``, rather than
just one of them. """

In my use case I actually want exactly that. So once your patch
lands, can I drop the use of pools and just set fanout=True on the
Target instead?
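
Roughly, this is what I'm asking about (only a sketch of the question,
not something I've verified works for notifications):

    # Hypothetical: would a single fanout Target replace the per-agent
    # pools once the Notifier accepts Target objects?
    target = oslo_messaging.Target(
        topic='ironic-neutron-agent-heartbeat', fanout=True)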

> 
> 
> 
> > 
> > NOTE: The second issue can be worked around by stopping and
> > starting rabbitmq as a dependency of the ironic-neutron-agent
> > service. This ensures only queues for active agent UUIDs are
> > present, and those queues will be consumed.
> > 
> > 
> > --
> > Harald Jensås
> > 
> > 
> > [1] https://bugs.launchpad.net/oslo.messaging/+bug/1814544
> > [2] https://storyboard.openstack.org/#!/story/2004933
> > [3] https://github.com/openstack/oslo.messaging/blob/master/oslo_messaging/transport.py#L58-L62
> > 
> > 
> > 
> 
> 



