<div dir="ltr"><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Feb 11, 2016 at 10:31 AM, Walter A. Boring IV <span dir="ltr"><<a href="mailto:walter.boring@hpe.com" target="_blank">walter.boring@hpe.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">There seems to be a few discussions going on here wrt to detaches. One is what to do on the Nova side with calling os-brick's disconnect_volume, and also when to or not to call Cinder's terminate_connection and detach.<br>
<br>
My original post was simply to discuss a mechanism to try and figure out the first problem. When should nova call brick to remove<br>
the local volume, prior to calling Cinder to do something.<div class="gmail_default" style="font-family:monospace,monospace;display:inline"></div> </blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Nova needs to know if it's safe to call disconnect_volume or not. Cinder already tracks each attachment, and it can return the connection_info for each attachment with a call to initialize_connection. If 2 of those connection_info dicts are the same, it's a shared volume/target. Don't call disconnect_volume if there are any more of those left.<br>
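For concreteness, the check being described could look roughly like this on the Nova side (a sketch only: it assumes the connection_info dicts for the attachments still present on this host have already been collected, and all names are illustrative):

    def is_shared_target(connection_infos):
        # If any two attachments on this host returned identical
        # connection_info, the backend exported one shared target.
        for i, info in enumerate(connection_infos):
            if info in connection_infos[i + 1:]:
                return True
        return False

    def safe_to_disconnect(connection_infos):
        # connection_infos: one dict per attachment on this host,
        # including the one being detached.  Per-attachment targets are
        # always safe; a shared target is only safe for the last user.
        if not is_shared_target(connection_infos):
            return True
        return len(connection_infos) <= 1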
>
> On the Cinder side of things, if terminate_connection/detach is called, the volume manager can find the list of attachments for a volume and compare that to the attachments on a host. The problem is, Cinder doesn't track the host along with the instance_uuid in the attachments table. I plan on allowing that as an API change after microversions lands, so we know how many times a volume is attached/used on a particular host. The driver can decide what to do with it at terminate_connection/detach time. This helps account for the differences in each of the Cinder backends, which we will never get all aligned to the same model. Each array/backend handles attachments differently, and only the driver knows if it's safe to remove the target or not, depending on how many attachments/usages it has on the host itself. This is the same thing as a reference counter, which we don't need, because we have the count in the attachments table once we allow setting the host and the instance_uuid at the same time.
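Put differently, once the attachments table carries the host as well as the instance_uuid, the driver-side decision amounts to something like this (hypothetical names, not actual Cinder code):

    def attachments_on_host(attachments, host):
        # Count attachment records for this volume that belong to the
        # given compute host (assumes an 'attached_host' field exists).
        return sum(1 for a in attachments if a.get('attached_host') == host)

    def terminate_connection(volume, connector, attachments):
        if attachments_on_host(attachments, connector['host']) > 1:
            # Other instances on this host still use the export;
            # leave the target in place.
            return
        # Last usage on this host: the driver can remove the target.
        remove_export_for_host(volume, connector)  # hypothetical helper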
Not trying to drag this out or be difficult, I promise. But this seems like it is in fact the same problem, and I'm not exactly following; if you store the info on the compute side during the attach phase, why would you need/want to then create a split-brain scenario and have Cinder do any sort of tracking on the detach side of things?

Like the earlier posts said, just don't call terminate_connection if you don't want to really terminate the connection? I'm sorry, I'm just not following the logic of why Cinder should track this and interfere with things. It's supposed to be providing a service to consumers and "do what it's told", even if it's told to do the wrong thing.
Walt<div class="HOEnZb"><div class="h5"><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Tue, Feb 09, 2016 at 11:49:33AM -0800, Walter A. Boring IV wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hey folks,<br>
One of the challenges we have faced with the ability to attach a single<br>
volume to multiple instances, is how to correctly detach that volume. The<br>
issue is a bit complex, but I'll try and explain the problem, and then<br>
describe one approach to solving one part of the detach puzzle.<br>
<br>
Problem:<br>
When a volume is attached to multiple instances on the same host. There<br>
are 2 scenarios here.<br>
<br>
1) Some Cinder drivers export a new target for every attachment on a<br>
compute host. This means that you will get a new unique volume path on a<br>
host, which is then handed off to the VM instance.<br>
<br>
2) Other Cinder drivers export a single target for all instances on a<br>
compute host. This means that every instance on a single host, will reuse<br>
the same host volume path.<br>
</blockquote>
>>
>> This problem isn't actually new. It is a problem we already have in Nova
>> even with single attachments per volume, e.g. with NFS and SMBFS there
>> is a single mount set up on the host, which can serve up multiple volumes.
>> We have to avoid unmounting that until no VM is using any volume provided
>> by that mount point. Except we pretend the problem doesn't exist, just
>> try to unmount every single time a VM stops, and rely on the kernel
>> failing umount() with EBUSY. Except this has a race condition if one VM
>> is stopping right as another VM is starting.
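That race disappears if the host tracks usage per mount point and serializes mount/umount against that count instead of leaning on EBUSY. A minimal in-memory sketch of the idea (illustrative only; a real fix would need tracking that survives restarts, which is part of the argument for recording more attachment info below):

    import collections
    import threading

    _mount_users = collections.Counter()
    _mount_lock = threading.Lock()

    def use_mount(mountpoint):
        with _mount_lock:
            if _mount_users[mountpoint] == 0:
                do_mount(mountpoint)       # hypothetical helper
            _mount_users[mountpoint] += 1

    def release_mount(mountpoint):
        with _mount_lock:
            _mount_users[mountpoint] -= 1
            if _mount_users[mountpoint] == 0:
                do_umount(mountpoint)      # hypothetical helper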
>>
>> There is a patch up to try to solve this for SMBFS:
>>
>> https://review.openstack.org/#/c/187619/
>>
>> but I don't really much like it, because it only solves it for one
>> driver.
>>
>> I think we need a general solution that solves the problem for all
>> cases, including multi-attach.
>>
>> AFAICT, the only real answer here is to have Nova record more info
>> about volume attachments, so it can reliably decide when it is safe
>> to release a connection on the host.
>>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Proposed solution:<br>
Nova needs to determine if the volume that's being detached is a shared or<br>
non shared volume. Here is one way to determine that.<br>
<br>
Every Cinder volume has a list of it's attachments. In those attachments<br>
it contains the instance_uuid that the volume is attached to. I presume<br>
Nova can find which of the volume attachments are on the same host. Then<br>
Nova can call Cinder's initialize_connection for each of those attachments<br>
to get the target's connection_info dictionary. This connection_info<br>
dictionary describes how to connect to the target on the cinder backend. If<br>
the target is shared, then each of the connection_info dicts for each<br>
attachment on that host will be identical. Then Nova would know that it's a<br>
shared target, and then only call os-brick's disconnect_volume, if it's the<br>
last attachment on that host. I think at most 2 calls to cinder's<br>
initialize_connection would suffice to determine if the volume is a shared<br>
target. This would only need to be done if the volume is multi-attach<br>
capable and if there are more than 1 attachments on the same host, where the<br>
detach is happening.<br>
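Taken together, the detach-side decision reduces to something like the sketch below (get_connection_info() stands in for the per-attachment initialize_connection call and is not a real API; the 'attached_host' filter assumes Nova can tell which attachments are local to this host):

    def should_call_disconnect_volume(volume, this_host):
        here = [a for a in volume['attachments']
                if a.get('attached_host') == this_host]
        if len(here) <= 1:
            return True                    # nothing else shares it here
        # Two calls are enough: identical connection_info means one
        # shared target serves every attachment on this host.
        first = get_connection_info(volume, here[0])
        second = get_connection_info(volume, here[1])
        if first == second:
            # Shared target: only the last attachment on this host
            # should call os-brick's disconnect_volume.
            return False
        return True                        # per-attachment targets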
>>
>> As above, we need to solve this more generally than just multi-attach;
>> even single-attach is flawed today.
>>
>> Regards,
>> Daniel
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev