On Thu, 18 Jul 2019 at 09:54, Eddie Yen <missile0407@gmail.com> wrote:

Hi everyone, I ran into an issue when trying to evacuate a host.
The platform is stable/rocky, deployed with kolla-ansible.
All storage backends are connected to Ceph.

Before I tried to evacuate the host, the source host had about 24 VMs running.
When I shut down the node and executed the evacuation, a few VMs failed. The error code was 504.
The strange thing is that each of those VMs has its own volume attached.

Then I checked the nova-compute log; the detailed error is pasted at the link below:
https://pastebin.com/uaE7YrP1

Does anyone have any experience with this? I googled but did not find enough information about it.

Thanks!

A 504 Gateway Timeout suggests that the server timeout in haproxy is too low, and that the server (cinder-api) did not respond to the request in time. The default timeout is 60s, and it is configured via haproxy_server_timeout (and possibly haproxy_client_timeout). You could try increasing this in globals.yml.
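
As a rough sketch of what that change might look like, the two variables named above can be overridden in globals.yml. The "5m" value is only an illustrative assumption, not a tested recommendation, and the file path shown is the usual kolla-ansible location, which may differ in your deployment:

```yaml
# /etc/kolla/globals.yml (excerpt)
# Raise the haproxy timeouts above the 60s default.
# "5m" is an assumed example value; choose one that covers how long
# cinder-api actually takes to respond during an evacuation.
haproxy_client_timeout: "5m"
haproxy_server_timeout: "5m"
```

These settings apply to haproxy as a whole rather than to cinder-api alone, and they should only take effect once the haproxy configuration has been regenerated, for example with a kolla-ansible reconfigure run.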

We do use a larger timeout for glance-api (haproxy_glance_api_client_timeout and haproxy_glance_api_server_timeout, both 6h). Perhaps we need something similar for cinder-api.