[Openstack] oslo-messaging times out after reconnecting to rabbit
Gordon Sim
gsim at redhat.com
Tue Jul 8 11:30:54 UTC 2014
On 07/08/2014 02:00 AM, Noel Burton-Krahn wrote:
> The thing is, that produces errors exactly like what I'm seeing in nova
> if rabbit dies and we reconnect to a new rabbit instance.
A call timing out while waiting for a response is a fairly general
problem for which there could be different causes.
> I'm tracing
> through the nova calls in the rabbit reconnect case to confirm that
> acknowledge is always being called when it should be.
Even if it is, the acknowledgement could be lost if the connection to
rabbitmq fails. However, I don't think that is likely to be the cause of
the timeout. Unlike in the example, in a real oslo.messaging-based
service the fact that the request is redelivered shouldn't be a problem.
The reply issued to it may be ignored or dropped, but the subsequent
requests will be processed.
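To illustrate why a redelivered request should be harmless, here is a toy
sketch in plain Python. It does not use oslo.messaging or RabbitMQ; the
names ToyCaller, start_call, give_up, on_reply and serve are all
hypothetical, and the real library may well handle this differently. It
only models the behaviour described above: the reply to the stale request
finds nobody waiting and is dropped, while the next request is answered
normally.

```python
import uuid


class ToyCaller:
    """Hypothetical stand-in for the calling side of an RPC layer."""

    def __init__(self):
        self.pending = {}   # msg_id -> reply (None until one arrives)
        self.dropped = []   # msg_ids of replies nobody was waiting for

    def start_call(self):
        msg_id = uuid.uuid4().hex
        self.pending[msg_id] = None
        return msg_id

    def give_up(self, msg_id):
        # The caller timed out, so it stops waiting for this msg_id.
        self.pending.pop(msg_id, None)

    def on_reply(self, msg_id, reply):
        if msg_id not in self.pending:
            # Reply to a call that is no longer outstanding: drop it.
            self.dropped.append(msg_id)
            return False
        self.pending[msg_id] = reply
        return True


def serve(caller, requests):
    """Process every queued request in order, replying to each one."""
    return [caller.on_reply(msg_id, payload.upper())
            for msg_id, payload in requests]


caller = ToyCaller()
old_id = caller.start_call()
caller.give_up(old_id)            # original call timed out during the failure
new_id = caller.start_call()

# Assume the broker redelivers the unacknowledged old request first.
outcomes = serve(caller, [(old_id, "ping"), (new_id, "pong")])
print(outcomes, caller.pending[new_id])
```

The stale reply ends up in caller.dropped and the new call still receives
its answer, so redelivery alone would not explain later calls timing out.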
I'm not completely clear on what the timing is in your original problem.
You say the timeout happens after a restart. Is it immediately after
(i.e. could some connections still be detecting the failure)? Or long
enough after that you are confident everything has failed over correctly?
(Obviously a failure or restart *during* a call may well result in a
timeout; that is the expected semantics at present).
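As a rough picture of those semantics, the sketch below uses only the
Python standard library. CallTimeout and wait_for_reply are hypothetical
names, not oslo.messaging API. A call whose reply was lost in the failure
gives up after its timeout, and a later call whose reply does arrive
completes normally.

```python
import queue


class CallTimeout(Exception):
    """Hypothetical error raised when no reply arrives in time."""


def wait_for_reply(replies, timeout):
    # Block until a reply is available or the timeout expires.
    try:
        return replies.get(timeout=timeout)
    except queue.Empty:
        raise CallTimeout("no reply within %.2f seconds" % timeout)


replies = queue.Queue()

# First call: the reply was lost when the connection failed.
try:
    wait_for_reply(replies, timeout=0.05)
    timed_out = False
except CallTimeout:
    timed_out = True

# Second call, after failover: the reply arrives as usual.
replies.put("ok")
result = wait_for_reply(replies, timeout=0.05)
print(timed_out, result)
```

If calls made well after failover still behave like the first case, the
reply path itself (rather than one lost message) is the thing to look at.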