Hey folks,<div><br></div><div>I've been spending some time with qpid recently, investigating a bug where compute nodes randomly lose their binding to their compute.hostname topics. When this happens, starting or deleting instances and a lot of other functionality that is addressed directly to the compute node topic silently fails. Anything that is a "cast" instead of a "call" just fails: no errors, no logging, etc. This is because the message reaches the exchange, but since no one is listening on the compute topic, it is silently dropped. Apparently there are ways to deal with this, such as setting up a DLQ (dead letter queue). The AMQP spec is also built to error out when this happens if certain flags are set. See the following for more info:</div>
<div><br></div><div><a href="http://qpid.2158936.n2.nabble.com/How-to-know-when-a-message-could-not-be-enqueued-td3751016.html#a3751626">http://qpid.2158936.n2.nabble.com/How-to-know-when-a-message-could-not-be-enqueued-td3751016.html#a3751626</a><br>
</div><div><br></div><div>In any case, I'm still not quite set on how I will handle this. I'm leaning towards implementing the discard-unroutable property in qpid and handling the exception in the sender, but I'm still not sure that is the best way to go about it. I'm also considering using queues as an alternative way to communicate with nodes. They are fairly persistent, so if there isn't a receiver on the line when we send the message, the receiver could pick it up later. I'm looking for feedback from the community on this, as I would like whatever work I do to make it upstream. Thx in advance.</div>
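<div><br></div><div>To make the two options a bit more concrete, here is a rough, in-memory Python sketch of the behavior described above. It is only a toy model and does not use the qpid API: ToyExchange, UnroutableError, cast and the reject_unroutable flag are all made-up names for illustration. It shows a cast being dropped silently once the binding is gone, the same cast surfacing an error when the exchange refuses unroutable messages, and a queue holding a message until a receiver comes back.</div>

```python
class UnroutableError(Exception):
    """Raised by the toy exchange when no binding matches (hypothetical)."""


class ToyExchange:
    """In-memory stand-in for a topic exchange. NOT the qpid API."""

    def __init__(self, reject_unroutable=False):
        # Assumption: a single flag models "error out instead of dropping".
        self.reject_unroutable = reject_unroutable
        self.bindings = {}  # routing key -> list of queues (plain lists)

    def bind(self, key, queue):
        self.bindings.setdefault(key, []).append(queue)

    def unbind(self, key):
        self.bindings.pop(key, None)

    def publish(self, key, message):
        queues = self.bindings.get(key, [])
        if not queues:
            if self.reject_unroutable:
                raise UnroutableError(key)
            return False  # dropped without telling anyone
        for queue in queues:
            queue.append(message)
        return True


def cast(exchange, key, message):
    """Fire-and-forget send that only notices explicit errors."""
    try:
        exchange.publish(key, message)
        return "sent"
    except UnroutableError:
        return "unroutable"


# Case 1: the binding is lost and the exchange drops the message silently.
lenient = ToyExchange()
lost_queue = []
lenient.bind("compute.host1", lost_queue)
lenient.unbind("compute.host1")
silent_result = cast(lenient, "compute.host1", "run_instance")

# Case 2: same situation, but the exchange rejects unroutable messages.
strict = ToyExchange(reject_unroutable=True)
strict_result = cast(strict, "compute.host1", "run_instance")

# Case 3: a queue that outlives its receiver keeps the message for later.
durable_queue = []
strict.bind("compute.host1", durable_queue)
queued_result = cast(strict, "compute.host1", "run_instance")
picked_up_later = durable_queue.pop(0)
```

<div>In the first case the sender sees "sent" even though nothing was delivered, which is the failure mode I'm chasing. The second case is roughly what I'd expect handling the exception in the sender to look like, and the third is the queue-based alternative.</div>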
<div><br></div><div>Mike Wilson</div><div>Bluehost.com</div><div><br></div><div><br></div>