Dear all,

Lately, one of our clients stored 300k files in a Manila CephFS share and then deleted the share in Manila. This made the driver unresponsive for several hours, until all the data had been removed from the cluster.

We had a quick look at the code in Manila [1]: the deletion is done by calling two APIs in the Ceph bindings, delete_volume [2] and then purge_volume [3]. The first call moves the directory into a volumes_deleted directory; the second recursively deletes all the contents of that directory. The second operation is the one that triggers the issue.

We had a similar issue in the past in Cinder, where Arne proposed deferred deletion of volumes. I think we could do the same in Manila for the CephFS driver: keep calling delete_volume as today, and then have a periodic task in the driver asynchronously list the contents of the volumes_deleted directory and trigger the purge (see the rough sketch at the end of this mail).

I can propose the change and contribute the code, but before going too deep I would like to know whether there is a reason for having a singleton for the volume_client connection. If I compare with the Cinder code, there the connection is established and closed on each operation against the backend.

If you are not the maintainer, could you please point me to them? I can post this on the mailing list if you prefer.

Cheers,
Jose Castro Leon
CERN Cloud Infrastructure

[1] https://github.com/openstack/manila/blob/master/manila/share/drivers/cephfs/...
[2] https://github.com/ceph/ceph/blob/master/src/pybind/ceph_volume_client.py#L7...
[3] https://github.com/ceph/ceph/blob/master/src/pybind/ceph_volume_client.py#L7...

PS: The issue was triggered by one of our clients in Kubernetes using the Manila CSI driver.
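
PS2: To illustrate the deferred deletion idea, here is a rough sketch, not working driver code. delete_volume() and purge_volume() are the existing calls in ceph_volume_client.py; the self.volume_client attribute, the periodic-task hook that would call _deferred_purge(), and the _list_trashed_volumes() helper are assumptions of mine, just to show the shape of the change.

# Rough illustrative sketch only, not working driver code.
# Assumptions: the driver keeps its CephFSVolumeClient in
# self.volume_client, some periodic-task machinery in the share manager
# calls _deferred_purge() regularly, and _list_trashed_volumes() is a
# hypothetical helper listing the entries under volumes_deleted.

from oslo_log import log
from ceph_volume_client import VolumePath

LOG = log.getLogger(__name__)


class CephFSNativeDriver(object):  # simplified stand-in for the real driver

    def delete_share(self, context, share, share_server=None):
        volume_path = VolumePath(None, share['id'])
        # Cheap: this only moves the directory under volumes_deleted, so
        # the API call returns quickly even for shares with 300k files.
        self.volume_client.delete_volume(volume_path)
        # purge_volume() is no longer called here; see _deferred_purge().

    def _deferred_purge(self):
        # Periodic task: purge trashed volumes in the background.
        for volume_id in self._list_trashed_volumes():
            volume_path = VolumePath(None, volume_id)
            try:
                # Expensive recursive removal, now outside the API path.
                self.volume_client.purge_volume(volume_path)
            except Exception:
                LOG.exception("Deferred purge of %s failed, will retry "
                              "on the next run", volume_id)

The important part is only that purge_volume() moves out of the synchronous delete path; retries and coordination between the periodic task and concurrent deletions would of course need proper handling in the real change.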