[openstack-dev] [nova][cinder] volumes stuck detaching attaching and force detach

Matt Riedemann mriedem at linux.vnet.ibm.com
Mon Feb 29 14:48:43 UTC 2016



On 2/22/2016 4:08 PM, Walter A. Boring IV wrote:
> On 02/22/2016 11:24 AM, John Garbutt wrote:
>> Hi,
>>
>> This just came up on IRC: when nova-compute gets killed halfway through
>> a volume attach (i.e. no graceful shutdown), things get stuck in a bad
>> state, like volumes stuck in the attaching state.
>>
>> This looks like a new addition to this conversation:
>> http://lists.openstack.org/pipermail/openstack-dev/2015-December/082683.html
>>
>> And brings us back to this discussion:
>> https://blueprints.launchpad.net/nova/+spec/add-force-detach-to-nova
>>
>> What if we move our attention towards automatically recovering from
>> the above issue? I am wondering if we can look at making our usual
>> recovery code deal with the above situation:
>> https://github.com/openstack/nova/blob/834b5a9e3a4f8c6ee2e3387845fc24c79f4bf615/nova/compute/manager.py#L934
>>
>>
>> Did we get the Cinder APIs in place that enable the force-detach? I
>> think we did and it was this one?
>> https://blueprints.launchpad.net/python-cinderclient/+spec/nova-force-detach-needs-cinderclient-api
>>
>>
>> I think diablo_rojo might be able to help dig for any bugs we have
>> related to this. I just wanted to get this idea out there before I
>> head out.
>>
>> Thanks,
>> John
>>
>> __________________________________________________________________________
>>
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> .
>>
> The problem is a little more complicated.
>
> In order for cinder backends to be able to do a force detach correctly,
> the Cinder driver needs to have the correct 'connector' dictionary
> passed in to terminate_connection.  That connector dictionary is the
> collection of initiator side information which is gleaned here:
> https://github.com/openstack/os-brick/blob/master/os_brick/initiator/connector.py#L99-L144
>
>
> The plan was to save that connector information in the Cinder
> volume_attachment table.  When a force detach is called, Cinder has the
> existing connector saved if Nova doesn't have it.  The problem was live
> migration.  When you migrate to the destination n-cpu host, the
> connector that Cinder had is now out of date.  There is no API in Cinder
> today to allow updating an existing attachment.
>
> So, the plan at the Mitaka summit was to add this new API, but it
> required microversions to land, which we still don't have in Cinder's
> API today.
>
>
> Walt
>

Regarding storing the initial connector information from the attach,
does this [1] help bridge the gap? That patch adds the connector dict to
the connection_info dict that is serialized and stored in the nova
block_device_mappings table, and then passes the stored connector to
terminate_connection in the case that the host has changed.

[1] https://review.openstack.org/#/c/266095/
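
To make the question concrete, here is a minimal sketch of the idea as I
read it. It is not the code from [1]; the function names and the "host"
key comparison are assumptions for illustration only. The connector is
stashed inside connection_info at attach time, and at detach time the
stashed copy is preferred when the current host differs from the one
that did the attach.

```python
import json


def save_connection_info(connection_info, connector):
    """Serialize connection_info with the attach-time connector embedded.

    Hypothetical helper: stands in for whatever writes connection_info
    to the block device mapping record.
    """
    info = dict(connection_info)
    info["connector"] = dict(connector)
    return json.dumps(info)


def connector_for_detach(serialized_connection_info, current_connector):
    """Pick the connector to hand to terminate_connection.

    Assumption: a differing "host" value means the detach is running on
    a different host than the attach, so the stashed connector is used.
    """
    info = json.loads(serialized_connection_info)
    stashed = info.get("connector")
    if stashed and stashed.get("host") != current_connector.get("host"):
        return stashed
    # Same host, or nothing was stashed (older record): use the live one.
    return current_connector


attach_connector = {"host": "compute-1", "initiator": "iqn.example:aaa"}
saved = save_connection_info(
    {"driver_volume_type": "iscsi", "data": {}}, attach_connector)

other_connector = {"host": "compute-2", "initiator": "iqn.example:bbb"}
chosen = connector_for_detach(saved, other_connector)
```

Records written before the connector was stashed fall through to the
current connector, so the sketch does not change behavior for them.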

-- 

Thanks,

Matt Riedemann



