[openstack-dev] [nova][cinder] what are the key errors with volume detach

Andrea Rosa andrea.rosa at hpe.com
Mon Dec 14 17:24:19 UTC 2015



On 10/12/15 14:42, Sean Dague wrote:
> On 12/02/2015 12:37 PM, Rosa, Andrea (HP Cloud Services) wrote:
>> Hi
>>
>> Thanks, Sean, for raising this point. I have been working on the change and on the (abandoned) spec.
>> I'll try to summarize here all the discussions we had and what we decided.
>>
>>> From: Sean Dague [mailto:sean at dague.net]
>>> Sent: 02 December 2015 13:31
>>> To: OpenStack Development Mailing List (not for usage questions)
>>> Subject: [openstack-dev] [nova] what are the key errors with volume detach
>>>
>>> This patch to add a bunch of logic to nova-manage for forcing volume detach
>>> raised a bunch of questions
>>> https://review.openstack.org/#/c/184537/24/nova/cmd/manage.py,cm
>>
>> On this specific review there are some valid concerns that I am happy to address, but first we need to understand if we want this change.
>> FWIW I think it is still a valid change, please see below.
>>
>>> In thinking about this for the last day, I think the real concern is that we have
>>> so many safety checks on volume delete, that if we failed with a partially
>>> setup volume, we have too many safety latches to tear it down again.
>>>
>>> Do we have some detailed bugs about how that happens? Is it possible to
>>> just fix DELETE to work correctly even when we're in these odd states?
>>
>> In a simplified view of a volume detach, the Nova code does the following:
>> 1. Detach the volume from the instance.
>> 2. Inform Cinder about the detach and call terminate_connection on the Cinder API.
>> 3. Delete the BDM (block device mapping) record in the Nova DB.
>>
>> If step 2 fails, the volume gets stuck in the "detaching" status, and any further attempt to delete or detach the volume will fail:
>> "Delete for volume <volume_id> failed: Volume <volume_id> is still attached, detach volume first. (HTTP 400)"
> 
> So why isn't this handled in a "finally" pattern?
> 
> Ensure that you always do 2 (a) & (b) and 3, collect errors that happen
> during 2 (a) & (b), report them back to the user.
> What state does that leave things in? Both from the server and the volume.
> 

The volume detach in Cinder (2.a, 2.b) is an async call: if Nova can talk
to the Cinder API, it sends the request, and if the detach then fails on
the Cinder side, Nova doesn't know about it.
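
To make the point concrete, here is a minimal sketch of a "finally"
pattern wrapped around an async step 2. Everything in it is hypothetical:
FakeCinder, request_detach, run_background_work and detach_volume are
illustrative names, not the real Nova or Cinder code, and the status
strings are only a simplified model of the flow described above.

```python
class FakeCinder:
    """Hypothetical stand-in for the Cinder API, not the real client."""

    def __init__(self, fail_in_background=False):
        self.status = "in-use"
        self.fail_in_background = fail_in_background
        self.queued = False

    def request_detach(self):
        # Step 2: the call returns as soon as the request is accepted.
        self.status = "detaching"
        self.queued = True

    def run_background_work(self):
        # Happens later on the Cinder side, out of the caller's sight.
        if self.queued and not self.fail_in_background:
            self.status = "available"
        self.queued = False


def detach_volume(cinder, nova_db_records, volume_id):
    """Run step 2, always run step 3, and return the errors that were seen."""
    errors = []
    try:
        cinder.request_detach()
    except Exception as exc:  # only synchronous API failures land here
        errors.append(exc)
    finally:
        nova_db_records.pop(volume_id, None)  # step 3: drop the Nova record
    return errors


cinder = FakeCinder(fail_in_background=True)
nova_db_records = {"vol-1": "/dev/vdb"}
errors = detach_volume(cinder, nova_db_records, "vol-1")
cinder.run_background_work()
print(errors, nova_db_records, cinder.status)  # [] {} detaching
```

In this model the error list comes back empty and the Nova record is
gone, yet the volume is still "detaching", so there is nothing to report
back to the user at the time of the call.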
--
Andrea Rosa




