[openstack-dev] [Nova][Cinder] Multi-attach, determining when to call os-brick's connector.disconnect_volume
Walter A. Boring IV
walter.boring at hpe.com
Thu Feb 11 17:31:29 UTC 2016
There seem to be a few discussions going on here wrt detaches. One
is what to do on the Nova side about calling os-brick's
disconnect_volume; the other is when to call, or not call, Cinder's
terminate_connection and detach.
My original post was simply to discuss a mechanism to try and figure out
the first problem: when should Nova call brick to remove
the local volume, prior to calling Cinder to do something?
Nova needs to know if it's safe to call disconnect_volume or not. Cinder
already tracks each attachment, and it can return the connection_info
for each attachment with a call to initialize_connection. If 2 of
those connection_info dicts are the same, it's a shared volume/target.
Don't call disconnect_volume if there are any more of those left.
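The comparison above can be sketched roughly as follows. This is only an
illustration, not Nova code: the helper name should_disconnect_volume is
hypothetical, the connection_info values are made up, and it assumes Nova
has already collected (via initialize_connection) the connection_info dicts
for the other attachments of the volume that remain on this host.

```python
def should_disconnect_volume(detaching_info, remaining_infos):
    """Return True if it looks safe to call os-brick's disconnect_volume.

    detaching_info: connection_info dict for the attachment being detached.
    remaining_infos: connection_info dicts for the other attachments of the
        same volume that stay on this host (hypothetical input; Nova would
        have to gather these from Cinder).
    """
    # Identical dicts mean the attachments share one target on this host,
    # so the local volume path is still in use by another instance.
    return all(info != detaching_info for info in remaining_infos)


# Made-up iSCSI connection_info values, for illustration only.
shared = {
    "driver_volume_type": "iscsi",
    "data": {
        "target_iqn": "iqn.2016-02.org.example:volume-1",
        "target_portal": "192.0.2.10:3260",
        "target_lun": 1,
    },
}
unique = {
    "driver_volume_type": "iscsi",
    "data": {
        "target_iqn": "iqn.2016-02.org.example:volume-1-attach-2",
        "target_portal": "192.0.2.10:3260",
        "target_lun": 2,
    },
}

# Shared target, another instance still uses it: skip disconnect_volume.
print(should_disconnect_volume(shared, [dict(shared)]))
# Per-attachment target: this path belongs only to the detaching instance.
print(should_disconnect_volume(shared, [unique]))
# Last attachment on the host: nothing else can be using the target.
print(should_disconnect_volume(shared, []))
```

Comparing whole dicts is the simplest test; a driver that returns
per-attachment fields (for example different credentials) for the same
target would need a narrower comparison.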
On the Cinder side of things, if terminate_connection or detach is called,
the volume manager can find the list of attachments for a volume, and
compare that to the attachments on a host. The problem is, Cinder
doesn't track the host along with the instance_uuid in the attachments
table. I plan on allowing that as an API change after microversions
land, so we know how many times a volume is attached/used on a
particular host. The driver can decide what to do with it at
terminate_connection or detach time. This helps account for
the differences in each of the Cinder backends, which we will never get
all aligned to the same model. Each array/backend handles attachments
differently, and only the driver knows if it's safe to remove the target or
not, depending on how many attachments/usages it has
on the host itself. This gives us the same information as a reference
counter, so we don't need a separate one: the count is already in the
attachments table once we allow setting the host and the instance_uuid
at the same time.
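As a rough sketch of that counting, assuming each attachment record carried
both a host and an instance_uuid (which is the proposed change, not what
Cinder stores today). The helper name and the record layout are
hypothetical, and the records are plain dicts rather than Cinder objects.

```python
def other_attachments_on_host(attachments, host, instance_uuid):
    """Count attachments of a volume on `host`, excluding `instance_uuid`.

    `attachments` is a list of dicts with hypothetical "host" and
    "instance_uuid" keys, standing in for rows of the attachments table.
    """
    return sum(
        1
        for a in attachments
        if a["host"] == host and a["instance_uuid"] != instance_uuid
    )


# Made-up attachment rows for one multi-attach volume.
attachments = [
    {"instance_uuid": "inst-a", "host": "compute-1"},
    {"instance_uuid": "inst-b", "host": "compute-1"},
    {"instance_uuid": "inst-c", "host": "compute-2"},
]

# Detaching inst-a: inst-b still uses the volume on compute-1.
print(other_attachments_on_host(attachments, "compute-1", "inst-a"))
# Detaching inst-c: nothing else on compute-2 uses the volume.
print(other_attachments_on_host(attachments, "compute-2", "inst-c"))
```

A driver could use a count of zero as the signal that removing the target
for that host is safe, and apply its own rules otherwise.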
> On Tue, Feb 09, 2016 at 11:49:33AM -0800, Walter A. Boring IV wrote:
>> Hey folks,
>> One of the challenges we have faced with the ability to attach a single
>> volume to multiple instances, is how to correctly detach that volume. The
>> issue is a bit complex, but I'll try and explain the problem, and then
>> describe one approach to solving one part of the detach puzzle.
>> When a volume is attached to multiple instances on the same host, there
>> are 2 scenarios:
>> 1) Some Cinder drivers export a new target for every attachment on a
>> compute host. This means that you will get a new unique volume path on a
>> host, which is then handed off to the VM instance.
>> 2) Other Cinder drivers export a single target for all instances on a
>> compute host. This means that every instance on a single host will reuse
>> the same host volume path.
> This problem isn't actually new. It is a problem we already have in Nova
> even with single attachments per volume. eg, with NFS and SMBFS there
> is a single mount setup on the host, which can serve up multiple volumes.
> We have to avoid unmounting that until no VM is using any volume provided
> by that mount point. Except we pretend the problem doesn't exist and just
> try to unmount every single time a VM stops, and rely on the kernel
> failing umount() with EBUSY. Except this has a race condition if one VM
> is stopping right as another VM is starting.
> There is a patch up to try to solve this for SMBFS:
> but I don't really much like it, because it only solves it for one
> I think we need a general solution that solves the problem for all
> cases, including multi-attach.
> AFAICT, the only real answer here is to have nova record more info
> about volume attachments, so it can reliably decide when it is safe
> to release a connection on the host.
>> Proposed solution:
>> Nova needs to determine if the volume that's being detached is a shared or
>> non shared volume. Here is one way to determine that.
>> Every Cinder volume has a list of its attachments. Each of those attachments
>> contains the instance_uuid that the volume is attached to. I presume
>> Nova can find which of the volume attachments are on the same host. Then
>> Nova can call Cinder's initialize_connection for each of those attachments
>> to get the target's connection_info dictionary. This connection_info
>> dictionary describes how to connect to the target on the cinder backend. If
>> the target is shared, then each of the connection_info dicts for each
>> attachment on that host will be identical. Then Nova would know that it's a
>> shared target, and then only call os-brick's disconnect_volume, if it's the
>> last attachment on that host. I think at most 2 calls to cinder's
>> initialize_connection would suffice to determine if the volume is a shared
>> target. This would only need to be done if the volume is multi-attach
>> capable and if there is more than 1 attachment on the same host where the
>> detach is happening.
> As above, we need to solve this more generally than just multi-attach,
> even single-attach is flawed today.