On 12/17/2015 8:51 AM, Andrea Rosa wrote:
>
>>> The communication with cinder is async, Nova doesn't wait or check if
>>> the detach on cinder side has been executed correctly.
>>
>> Yeah, I guess nova gets the 202 back:
>>
>> http://logs.openstack.org/18/258118/2/check/gate-tempest-dsvm-full-ceph/7a5290d/logs/screen-n-cpu.txt.gz#_2015-12-16_03_30_43_990
>>
>> Should nova be waiting for detach to complete before it tries deleting
>> the volume (in the case that delete_on_termination=True in the bdm)?
>>
>> Should nova be waiting (regardless of volume delete) for the volume
>> detach to complete - or timeout and fail the instance delete if it doesn't?
>
> I'll revisit this change next year trying to look at the problem in a
> different way.
> Thank you all for your time and all the suggestions.
> --
> Andrea Rosa

I had a quick discussion with hemna this morning and he confirmed that nova
should be waiting for os-detach to complete before we try to delete the
volume, because if the volume status isn't 'available' the delete will fail.

Also, if nova is hitting a failure to delete the volume, it's swallowing it
by passing raise_exc=False to _cleanup_volumes here [1]. Then we go on our
merry way and delete the bdms in the nova database [2]. But I'd think at
that point we're orphaning volumes in cinder that think they are still
attached.

If this is passing today, it's probably just luck that we're getting the
volume detached fast enough before we try to delete it.

[1] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2425-L2426
[2] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L909

--

Thanks,

Matt Riedemann
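
P.S. For illustration only, here is a minimal sketch of the
"wait for the detach to finish before deleting" idea discussed above.
This is not Nova's actual code; the cinder_api object and its
get_volume_status()/delete_volume() helpers are hypothetical stand-ins
for the real cinderclient calls, and the timeout values are arbitrary.

    import time

    class VolumeDetachTimeout(Exception):
        pass

    def delete_volume_when_detached(cinder_api, volume_id,
                                    timeout=60, interval=2):
        """Poll cinder until the volume is 'available', then delete it.

        Raises VolumeDetachTimeout instead of silently swallowing a
        detach that never completes.
        """
        deadline = time.time() + timeout
        while time.time() < deadline:
            # Hypothetical helper: returns the volume's current status string.
            status = cinder_api.get_volume_status(volume_id)
            if status == 'available':
                # Detach finished on the cinder side; delete can now succeed.
                cinder_api.delete_volume(volume_id)
                return
            if status == 'error_detaching':
                raise VolumeDetachTimeout(
                    'volume %s failed to detach' % volume_id)
            time.sleep(interval)
        raise VolumeDetachTimeout(
            'timed out waiting for volume %s to detach' % volume_id)

The point is just that the delete is gated on the detach actually
completing (or failing loudly), rather than firing it off right after
the 202 and hoping the race works out.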