Perhaps, but not so fast. I still need to investigate further.

On Thu, 8 Aug 2019 at 22:36, Mark Goddard <mark@stackhpc.com> wrote:


On Thu, 8 Aug 2019 at 11:39, Eddie Yen <missile0407@gmail.com> wrote:
Hi Mark, thanks for suggestion.

I think so, too. Cinder-api may be fine, but HAProxy could be very busy since one controller is down.
I'll try increasing the cinder-api timeout value.
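What I have in mind is a globals.yml override. A minimal sketch, assuming per-service knobs like the existing glance ones get added for cinder (the haproxy_cinder_api_* names are hypothetical and don't exist in kolla-ansible yet):

    # /etc/kolla/globals.yml - minimal sketch
    # Existing glance knobs mentioned further down the thread:
    haproxy_glance_api_client_timeout: "6h"
    haproxy_glance_api_server_timeout: "6h"
    # Hypothetical cinder equivalents (names assumed, not yet upstream);
    # 10m matches the keystoneauth1 default Matt mentions below:
    haproxy_cinder_api_client_timeout: "10m"
    haproxy_cinder_api_server_timeout: "10m"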

Will you be proposing this fix upstream?

On Wed, 7 Aug 2019 at 00:06, Mark Goddard <mark@stackhpc.com> wrote:


On Tue, 6 Aug 2019 at 16:33, Matt Riedemann <mriedemos@gmail.com> wrote:
On 8/6/2019 7:18 AM, Mark Goddard wrote:
> We do use a larger timeout for glance-api
> (haproxy_glance_api_client_timeout
> and haproxy_glance_api_server_timeout, both 6h). Perhaps we need
> something similar for cinder-api.

A 6 hour timeout for cinder API calls would be nuts IMO. The thing that
was failing was a volume attachment delete/create from what I recall,
which is the newer version (as of Ocata?) of the old
initialize_connection/terminate_connection APIs. These are synchronous
RPC calls from cinder-api to cinder-volume to do things on the storage
backend, and we have seen them take longer than 60 seconds in the gate
CI runs with the lvm driver. The investigation normally turned up
lvchange taking over 60 seconds because some concurrent operation was
locking out the RPC call, which eventually results in the
MessagingTimeout from oslo.messaging. That's unrelated to your gateway
timeout from HAProxy, but the point is that you likely want to bump up
those timeouts, since cinder-api makes these synchronous calls to the
cinder-volume service. I just don't think you need to go to 6 hours :).
I think the keystoneauth1 default HTTP response timeout is 10 minutes,
so maybe try that.


Yeah, wasn't advocating for 6 hours - just showing which knobs are available :)
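For completeness, the knobs I think bite at ~60 seconds here are the global proxy timeouts, haproxy_client_timeout and haproxy_server_timeout (1m defaults, if I remember right - check group_vars/all.yml for your release). A rough sketch of bumping those toward the 10 minute figure Matt mentions, until a cinder-specific knob exists:

    # /etc/kolla/globals.yml - rough sketch; names/defaults from memory,
    # verify against your kolla-ansible release before relying on them.
    haproxy_client_timeout: "10m"
    haproxy_server_timeout: "10m"

Bear in mind this raises the timeout for every API behind HAProxy, which is a much blunter instrument than a cinder-specific override.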
 
--

Thanks,

Matt