<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, May 13, 2013 at 2:23 PM, Vishvananda Ishaya <span dir="ltr"><<a href="mailto:vishvananda@gmail.com" target="_blank">vishvananda@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi everyone,<br>
<br>
There have been efforts in cinder to support migration of volumes:<br>
<br>
<a href="https://blueprints.launchpad.net/cinder/+spec/volume-migration" target="_blank">https://blueprints.launchpad.net/cinder/+spec/volume-migration</a><br>
<br>
This is especially important for maintenance scenarios. While some storage backends support migration internally, our default backend is iscsi/lvm based. In this model a volume resides on exactly one host. We need a way to perform maintenance in our default config. Picture a scenario like the following:<br>
<br>
An instance is running on HOST A and has attached a volume from HOST B. The instance is actively reading and writing to the volume. HOST C is empty.<br>
<br>
+--------+ +--------+ +--------+<br>
| HOST A | | HOST B | | HOST C |<br>
+--------+ +--------+ +--------+<br>
|instance|<-attached-| volume | | |<br>
+--------+ +--------+ +--------+<br>
<br>
If an administrator needs to take HOST A down for maintenance, she could initiate a live-migration of the instance from HOST A to HOST C, and the instance could continue running without downtime. If, on the other hand, HOST B needs to be taken down, there is no way to keep the system up and running. She would have to pause the instance on HOST A, detach the volume, copy the data to HOST C, then reattach and resume the instance.<br>
</blockquote><div style><br></div><div style>I agree with the problem statement and the issue with taking down HOST B. Rather than detach and copy, though, I was hoping to just take a snapshot and use that for the copy while the original volume stays live. The trick is syncing the diffs that were written while the snapshot was being copied over; more on this in my response to the generalized fallback approach below.</div>
<div style><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
This clearly is not ideal.<br>
<br>
There was a proposal in nova to allow for volumes to be migrated along with a live/block migration:<br>
<br>
<a href="https://blueprints.launchpad.net/nova/+spec/migrate-volume-block-migration" target="_blank">https://blueprints.launchpad.net/nova/+spec/migrate-volume-block-migration</a><br>
<br>
Libvirt does support switching volumes during a migration, but I feel this is overcomplicated. It would require migrating the instance AND the volume to HOST C even if we only need to do maintenance on HOST B.<br>
<br>
I have a POC for an alternative proposal here:<br>
<br>
<a href="https://review.openstack.org/#/c/28995" target="_blank">https://review.openstack.org/#/c/28995</a><br>
<br>
The general idea is to allow a command that swaps an existing/mounted volume out for another volume transparently to the guest. All data is copied to the new volume during the swap. There are a few concerns with this approach, so I wanted to get it out there to be looked at by other people before putting together a blueprint/cleaning it up/etc.<br>
<br>
The major concern I have is that this transparent fast swap has some limitations:<br>
a) the instance must be running<br>
b) it only works for iscsi volumes (I have not tested to see if it is even possible for things like rbd volumes)<br>
c) only recent versions of libvirt/qemu support the feature.<br>
d) it is unlikely that other hypervisors can implement this feature easily.<br></blockquote><div style><br></div><div style>Of these, points 'c' and 'd' are the main concerns in my opinion. I think points 'a' and 'b' can be addressed via a generalized abstraction, but I still haven't figured out how to solve the issue mentioned above regarding changes written to disk during the initial copy/clone. One thought I had was:</div>
<div style>snapshot vol-a --> copy snapshot vol-b (this might be an attach/dd combo or something else) --> raw device rsync vol-a to vol-b --> yet to be discovered magic to detach/attach </div><div style><br></div>
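<div style>To make the snapshot/copy/sync idea a bit more concrete, here is a rough sketch using in-memory byte buffers in place of real block devices. Everything in it (the names <code>changed_blocks</code> and <code>sync_blocks</code>, the tiny block size, the assumption that both volumes are the same size) is made up for illustration; it is not how rsync or any Cinder driver actually works.</div>

```python
import hashlib

BLOCK_SIZE = 4  # tiny, hypothetical block size so the example is easy to follow


def changed_blocks(src: bytes, dst: bytes, block_size: int = BLOCK_SIZE):
    """Yield offsets of blocks whose checksums differ (assumes equal-sized volumes)."""
    for offset in range(0, len(src), block_size):
        a = hashlib.sha256(src[offset:offset + block_size]).digest()
        b = hashlib.sha256(dst[offset:offset + block_size]).digest()
        if a != b:
            yield offset


def sync_blocks(src: bytes, dst: bytearray, block_size: int = BLOCK_SIZE) -> int:
    """Copy only the differing blocks from src into dst; return how many were copied."""
    copied = 0
    # Compare against a frozen copy of dst so we never read while we write.
    for offset in changed_blocks(src, bytes(dst), block_size):
        dst[offset:offset + block_size] = src[offset:offset + block_size]
        copied += 1
    return copied


# vol_a stands in for the live volume, vol_b for the new one.
vol_a = bytearray(b"AAAABBBBCCCCDDDD")
snapshot = bytes(vol_a)        # snapshot vol-a
vol_b = bytearray(snapshot)    # copy snapshot to vol-b
vol_a[4:8] = b"XXXX"           # the guest keeps writing while the copy runs
copied = sync_blocks(bytes(vol_a), vol_b)  # sync only the diffs from vol-a to vol-b
```

<div style>Note this only narrows the window: writes can still land on vol-a during the sync pass, so the final detach/attach step still needs the guest quiesced or some other trick.</div>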
<div style>The good thing is we can have the functionality even if it's not as sexy as the blockmigrate implementation.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
For this reason the best approach might be to consider this to be a fast-path version of swap and we could have a slower/fallback approach which would be:<br>
1. pause the vm<br>
2. connect to the new volume<br>
3. copy data to the new volume<br>
4. swap the volumes out<br>
5. unpause the vm<br>
<br></blockquote><div style>I think this is reasonable, and even this slower path can be optimized (whether in the initial implementation or via enhancements in future patches). </div><div style> </div>
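<div style>For reference, the five fallback steps could be sketched roughly like this. <code>FakeInstance</code>, <code>paused</code> and <code>fallback_swap</code> are hypothetical names that just record calls; nothing here talks to a real hypervisor or to the nova/cinder code. The one design point worth showing is that the unpause should happen even if the copy fails.</div>

```python
from contextlib import contextmanager


class FakeInstance:
    """Hypothetical stand-in for a guest; records calls instead of driving a hypervisor."""

    def __init__(self, volume):
        self.volume = volume
        self.paused = False
        self.log = []

    def pause(self):
        self.paused = True
        self.log.append("pause")

    def unpause(self):
        self.paused = False
        self.log.append("unpause")

    def swap(self, new_volume):
        self.volume = new_volume
        self.log.append("swap")


@contextmanager
def paused(instance):
    """Pause the guest and always unpause it, even if a later step raises."""
    instance.pause()
    try:
        yield instance
    finally:
        instance.unpause()


def fallback_swap(instance, new_volume):
    with paused(instance):                  # 1. pause the vm
        instance.log.append("connect")      # 2. connect to the new volume (stubbed out)
        new_volume[:] = instance.volume     # 3. copy data to the new volume
        instance.log.append("copy")
        instance.swap(new_volume)           # 4. swap the volumes out
    # 5. the vm is unpaused when the with-block exits


old = bytearray(b"guest data")
new = bytearray(len(old))
vm = FakeInstance(old)
fallback_swap(vm, new)
```

<div style>The length of the pause is dominated by step 3, which is where any optimization (for example copying most of the data before pausing) would pay off.</div>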
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I believe this is implementable in all hypervisors. So we can support the swap feature but it may incur a (potentially long) pause with some hypervisors/machine states/backend volumes.<br>
<br>
Now for questions:<br>
A) Does the fast swap with fallback seem like a reasonable approach?<br></blockquote><div style><br></div><div style>Personally I think for a number of use cases this is a good approach. I'd even be interested in having a tested/recommended config that has the mirroring pre-set across nodes so we wouldn't even need to take that initial mirroring update hit. If folks are interested enough in live migration it might be worth pointing out an optimized config for doing this.</div>
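<div style>The fast-path-with-fallback decision itself could be as simple as the check sketched here, based on limitations (a) through (d) from the proposal. <code>SwapRequest</code> and <code>choose_swap_path</code> are hypothetical names, and treating "iscsi" as the only fast-path protocol is an assumption taken from point (b), not something I've verified.</div>

```python
from dataclasses import dataclass


@dataclass
class SwapRequest:
    instance_running: bool       # limitation (a): the instance must be running
    volume_protocol: str         # limitation (b): e.g. "iscsi" or "rbd"
    hypervisor_fast_swap: bool   # limitations (c) and (d): hypervisor supports the feature


def choose_swap_path(req: SwapRequest) -> str:
    """Return "fast" only when every fast-path precondition holds, else "fallback"."""
    if (req.instance_running
            and req.volume_protocol == "iscsi"
            and req.hypervisor_fast_swap):
        return "fast"
    return "fallback"


fast = choose_swap_path(SwapRequest(True, "iscsi", True))
slow = choose_swap_path(SwapRequest(True, "rbd", True))
```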
<div style> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
B) Is this feature useful enough to justify the effort?<br></blockquote><div><br></div><div style>In my opinion yes</div><div style> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
C) Does anyone have alternate proposals?<br></blockquote><div><br></div><div style>Some ideas were kicked around at the summit regarding a migration service running in Cinder that would do this sort of thing on a Cinder node. It would use backend-specific implementations where applicable, and generic dd-type operations for backend-to-backend copies. For the specific use case you mention here I'm not sure it's better or worse (well, it's probably worse :) ). It may even be that the migration service approach ends up being the fallback case.</div>
<div style><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Thanks,<br>
Vish<br>
_______________________________________________<br>
OpenStack-dev mailing list<br>
<a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
</blockquote></div></div><div class="gmail_extra" style><br></div><div class="gmail_extra" style>I do have some other use cases I'd like to address and look at. They may be different enough that they don't impact this particular proposal, but I think it would be useful to support importing volumes into Cinder, and exporting/importing Cinder volumes from one cloud installation to another. I can elaborate more on these later, but I don't think they're directly related to this proposal and I don't want to run down a rabbit hole talking about those things here. I just wanted to at least have the concept in mind as we look at this idea.</div>
<div class="gmail_extra" style><br></div><div class="gmail_extra" style>Anyway, looking forward to hearing what others think.</div><div class="gmail_extra" style><br></div><div class="gmail_extra" style>John</div><div class="gmail_extra" style>
<br></div></div>