Open Stack

Tue Nov 20 21:46:22 UTC 2012

The topic exchange is what causes the behavior I noted. If no one is
listening on a topic, then the topic has no route, so the message is thrown
away with no error. I'm not familiar with the notifier, but if it addresses a
node topic then it should work just as well with a node queue.
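
To make the failure mode concrete, here is a toy in-memory model of it. This
is only a sketch: it is not the qpid or AMQP client API, it matches routing
keys exactly rather than by topic pattern, and every name in it
(ToyTopicExchange, UnroutableError, reject_unroutable) is made up for
illustration.

```python
class UnroutableError(Exception):
    """Raised by the toy exchange when a message matches no binding."""


class ToyTopicExchange:
    """In-memory stand-in for a topic exchange (exact-match keys only)."""

    def __init__(self):
        self.bindings = {}  # routing key -> list of bound queues (plain lists)
        self.dropped = 0    # number of messages discarded without an error

    def bind(self, key, queue):
        self.bindings.setdefault(key, []).append(queue)

    def unbind(self, key, queue):
        self.bindings[key].remove(queue)
        if not self.bindings[key]:
            del self.bindings[key]

    def publish(self, key, message, reject_unroutable=False):
        """Deliver to every bound queue; return True if anything received it."""
        queues = self.bindings.get(key, [])
        if not queues:
            if reject_unroutable:
                # Hypothetical opt-in: surface the problem to the sender.
                raise UnroutableError(key)
            # Default: the "cast" vanishes and the sender never finds out.
            self.dropped += 1
            return False
        for queue in queues:
            queue.append(message)
        return True


exchange = ToyTopicExchange()
node_queue = []
exchange.bind("compute.host1", node_queue)
exchange.publish("compute.host1", "run_instance")        # delivered
exchange.unbind("compute.host1", node_queue)             # binding is lost
exchange.publish("compute.host1", "terminate_instance")  # silently dropped
```

With the binding gone, the second publish returns normally and only the
exchange's own counter knows a message was lost. Passing
reject_unroutable=True is the toy equivalent of asking the broker to report
unroutable messages back to the sender.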

I will do some more research into what expectations we have around our
communication, but I think at least I am going to try to convert our
installation to use queues for node communication and see how it goes. I'll
probably have something interesting to say about it in a few days :-).
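
As a rough sketch of the semantics I am after (again a toy model, not qpid
code; ToyNodeQueue and its clock parameter are hypothetical names): a node
queue holds messages while no receiver is attached, and a per-message TTL,
as Russell suggests below, keeps stale messages from being delivered much
later.

```python
import time
from collections import deque


class ToyNodeQueue:
    """In-memory stand-in for a per-node queue with per-message TTL."""

    def __init__(self, clock=time.monotonic):
        self._messages = deque()  # entries are (expires_at, message)
        self._clock = clock       # injectable so the example is deterministic

    def send(self, message, ttl):
        """Enqueue a message even if no receiver is currently attached."""
        self._messages.append((self._clock() + ttl, message))

    def receive(self):
        """Return the oldest unexpired message, or None if there is none."""
        now = self._clock()
        while self._messages:
            expires_at, message = self._messages.popleft()
            if expires_at > now:
                return message
            # Expired entries are discarded as they are encountered.
        return None


# The compute node is "down" while both messages are sent.
now = [0.0]
queue = ToyNodeQueue(clock=lambda: now[0])
queue.send("run_instance", ttl=60)
queue.send("terminate_instance", ttl=5)

# Ten seconds later the node comes back and drains its queue.
now[0] = 10.0
first = queue.receive()   # still within its TTL
second = queue.receive()  # the 5-second message has expired
```

The point of the design is that the sender does not need a live receiver at
send time, while the TTL bounds how long a request can wait before it is
considered too old to act on.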

Mike Wilson
Bluehost.com

On Tue, Nov 20, 2012 at 8:12 AM, Russell Bryant <rbryant at redhat.com> wrote:

> On 11/20/2012 12:23 AM, Mike Wilson wrote:
> > Hey folks,
> >
> > I've been spending some time with qpid recently investigating a bug
> > where compute nodes will randomly lose their binding to their
> > compute.hostname topics. When this happens, starting new instances,
> > deleting them, and lots of other functionality that is addressed directly
> > to the compute node topic silently fails. Anything that is a "cast"
> > instead of a "call" just fails: no errors, no logging, etc. This is
> > because the message goes to the exchange, but since no one is listening
> > on the compute topic it is silently dropped. Apparently there are ways to
> > deal with this by setting up a DLQ; also, the AMQP spec is built to error
> > out when this happens if certain flags are set. See the following for
> > more info:
> >
> > http://qpid.2158936.n2.nabble.com/How-to-know-when-a-message-could-not-be-enqueued-td3751016.html#a3751626
> >
> > In any case, I'm still not quite set on how I will handle this. I'm
> > leaning towards implementing the discard-unroutable property in qpid and
> > handling the exception in the sender, but I'm still not sure that is the
> > best way to go about it. I'm also considering using queues as an
> > alternative way to communicate with nodes. They are fairly persistent, so
> > if there isn't a receiver on the line when we send the message, the
> > receiver could pick it up later. I'm looking for some feedback from the
> > community on this, as I would like whatever work I'm doing to make it
> > upstream. Thx in advance.
>
> We should start by defining what behavior we want.  I agree with what
> you say here at the end.  Ideally when a message is sent to 'compute' or
> 'compute.<node>' but nothing is currently listening, we want that
> message to be queued up and waiting for a compute node to come back
> alive and handle it.  (We should be setting a TTL on all messages to
> ensure that they don't stay in a queue forever, but we're not doing
> that yet.)
>
> Is the fact that it's a topic exchange messing this up?  AFAIK, nothing
> makes use of the fact that these are topic exchanges, except maybe
> notifications (rpc_notifier), so we need to watch out for that.
>
> For the 'compute' and 'compute.<node>' style queues used by all of the
> services, I believe queues on a direct exchange would work just fine for
> the semantics we care about.
>
> --
> Russell Bryant

[openstack-dev] Why topics instead of queues to communicate with compute nodes?
