Hi Matt, thanks for your reply.

The log I pasted is from nova-compute.
I also checked the cinder-api and cinder-volume logs around the same timestamps. Strangely, no error messages were logged during that time.
As I recall, I launched the evacuation on the host.
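
For anyone who wants to repeat the log check, here is a minimal sketch of filtering a log by time window and level. It assumes the default oslo.log line layout ("date time pid LEVEL logger [context] message"); the function name, the level set, and the sample lines are hypothetical illustrations, not real output from this environment.

```python
from datetime import datetime

# Levels worth looking at when hunting for a failure (an assumption, adjust as needed).
LEVELS = {"WARNING", "ERROR", "CRITICAL"}


def lines_in_window(lines, start, end, levels=LEVELS):
    """Return log lines timestamped within [start, end] whose level is in `levels`."""
    hits = []
    for line in lines:
        # Expected fields: date, time, pid, level, rest of the line.
        parts = line.split(None, 4)
        if len(parts) < 4:
            continue
        try:
            ts = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S.%f")
        except ValueError:
            # Traceback continuation or a line in another format: skip it.
            continue
        if start <= ts <= end and parts[3] in levels:
            hits.append(line)
    return hits


# Hypothetical sample lines, only to show the expected layout.
sample = [
    "2019-07-18 03:50:01.120 4121 INFO cinder.api [req-hypothetical] request handled",
    "2019-07-18 03:53:40.007 4121 ERROR cinder.volume.manager [req-hypothetical] example failure",
    "2019-07-18 04:30:00.000 4121 ERROR cinder.volume.manager [req-hypothetical] outside the window",
]
found = lines_in_window(sample, datetime(2019, 7, 18, 3, 45), datetime(2019, 7, 18, 4, 0))
```

To run it against a real file, pass the open file object as `lines`, with the window set around the time of the evacuation.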

It may be overloading, but I don't think it is happening in cinder, because the environment uses a 3-node all-in-one installation model.
That means each node runs both control and compute services, and the 3 nodes together form the controller HA cluster.
When I shut down one of the nodes, all API requests became noticeably slow (you can feel it when using the dashboard),
and everything went back to normal once the node was back.

I'll try the evacuation again, but this time by only disabling the nova host or stopping the nova services, to see whether it happens again.

Matt Riedemann <mriedemos@gmail.com> wrote on Tue, Jul 23, 2019 at 6:40 AM:
On 7/18/2019 3:53 AM, Eddie Yen wrote:
> Before I tried to evacuate the host, the source host had about 24 VMs running.
> When I shut down the node and executed the evacuation, a few VMs
> failed. The error code is 504.
> Strangely, those VMs all have their own volumes attached.
>
> Then I checked the nova-compute log; the detailed error is pasted at the link below:
> https://pastebin.com/uaE7YrP1
>
> Does anyone have any experience with this? I googled but could not find
> enough information about it.

Are there errors in the cinder-api logs during the evacuate of all VMs
from the host? Are you doing the evacuate operation on all VMs on the
host concurrently or in serial? I wonder if you're over-loading cinder
and that's causing the timeout somehow. The timeout from cinder is when
deleting volume attachment records, which would be terminating
connections in the storage backend under the covers. Check the
cinder-volume logs for errors as well.

--

Thanks,

Matt