[openstack-dev] [nova][cinder] Migrating Volumes for Maintenance

Vishvananda Ishaya vishvananda at gmail.com
Mon May 13 20:23:06 UTC 2013


Hi everyone,

There have been efforts in cinder to support migration of volumes:

https://blueprints.launchpad.net/cinder/+spec/volume-migration

This is especially important for maintenance scenarios. While some storage backends support migration internally, our default backend is iscsi/lvm based. In this model a volume resides on exactly one host. We need a way to perform maintenance in our default config. Picture a scenario like the following:

An instance is running on HOST A and has attached a volume from HOST B. The instance is actively reading and writing to the volume. HOST C is empty.

+--------+           +--------+       +--------+
| HOST A |           | HOST B |       | HOST C |
+--------+           +--------+       +--------+
|instance|<-attached-| volume |       |        |
+--------+           +--------+       +--------+

If an administrator needs to take HOST A down for maintenance, she could initiate a live-migration of the instance from HOST A to HOST C, and the instance could continue running without downtime. If, on the other hand, HOST B needs to be taken down, there is no way to keep the system up and running. She would have to pause the instance on HOST A, detach the volume, copy the data to HOST C, then reattach and resume the instance.

This clearly is not ideal.

There was a proposal in nova to allow for volumes to be migrated along with a live/block migration:

https://blueprints.launchpad.net/nova/+spec/migrate-volume-block-migration

Libvirt does support switching volumes during a migration, but I feel this is overcomplicated. It would require migrating the instance AND the volume to HOST C even if we only need to do maintenance on HOST B.

I have a POC for an alternative proposal here:

https://review.openstack.org/#/c/28995

The general idea is to allow a command that swaps an existing/mounted volume out for another volume transparently to the guest. All data is copied to the new volume during the swap. There are a few concerns with this approach, so I wanted to get it out there to be looked at by other people before putting together a blueprint/cleaning it up/etc.

The major concern I have is that this transparent fast swap has some limitations:
 a) the instance must be running
 b) it only works for iscsi volumes (I have not tested to see if it is even possible for things like rbd volumes)
 c) only recent versions of libvirt/qemu support the feature
 d) it is unlikely that other hypervisors can implement this feature easily
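
To make the limitations above concrete, here is a minimal sketch of a precondition check for the fast path. Everything in it is hypothetical and for illustration only: SwapRequest, its fields, and fast_swap_blockers are not names from the POC or from nova, and the check for (c)/(d) is reduced to a single boolean that a real driver would have to work out for itself.

```python
from dataclasses import dataclass


@dataclass
class SwapRequest:
    # Hypothetical fields, not taken from the POC.
    instance_running: bool
    volume_protocol: str            # e.g. "iscsi" or "rbd"
    hypervisor_has_live_copy: bool  # stands in for limitations (c) and (d)


def fast_swap_blockers(req):
    """Return the reasons the fast path cannot be used (empty list = usable)."""
    blockers = []
    if not req.instance_running:
        blockers.append("a) instance is not running")
    if req.volume_protocol != "iscsi":
        blockers.append("b) only iscsi volumes are known to work")
    if not req.hypervisor_has_live_copy:
        blockers.append("c/d) hypervisor lacks the live copy feature")
    return blockers
```

Returning the list of reasons rather than a bare True/False is just one possible choice; it would let an operator see why a swap did not take the fast path.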

For this reason, the best approach might be to treat this as the fast-path version of swap and to have a slower fallback approach, which would be:
 1. pause the vm
 2. connect to the new volume
 3. copy data to the new volume
 4. swap the volumes out
 5. unpause the vm

I believe this is implementable in all hypervisors. So we can support the swap feature, but it may incur a (potentially long) pause with some hypervisors, machine states, or backend volumes.
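
The fast path with fallback could be sketched roughly as follows. This is not the POC code: RecordingDriver is a hypothetical stand-in that only records which operations were requested, and the method names (pause, connect_volume, copy_data, swap, unpause, fast_swap) are assumptions chosen to mirror steps 1 through 5 above.

```python
class RecordingDriver:
    """Hypothetical hypervisor driver; it only records the calls it receives."""

    def __init__(self, supports_fast_swap=False):
        self.supports_fast_swap = supports_fast_swap
        self.calls = []

    def fast_swap(self, instance, old_volume, new_volume):
        self.calls.append(("fast_swap", instance, old_volume, new_volume))

    def pause(self, instance):
        self.calls.append(("pause", instance))

    def connect_volume(self, new_volume):
        self.calls.append(("connect_volume", new_volume))

    def copy_data(self, old_volume, new_volume):
        self.calls.append(("copy_data", old_volume, new_volume))

    def swap(self, instance, old_volume, new_volume):
        self.calls.append(("swap", instance, old_volume, new_volume))

    def unpause(self, instance):
        self.calls.append(("unpause", instance))


def slow_swap(driver, instance, old_volume, new_volume):
    """Fallback path: steps 1-5, with the guest paused for the whole copy."""
    driver.pause(instance)                              # 1. pause the vm
    try:
        driver.connect_volume(new_volume)               # 2. connect to the new volume
        driver.copy_data(old_volume, new_volume)        # 3. copy data
        driver.swap(instance, old_volume, new_volume)   # 4. swap the volumes out
    finally:
        driver.unpause(instance)                        # 5. unpause, even on failure


def swap_volume(driver, instance, old_volume, new_volume):
    """Use the fast path when the driver offers it, else fall back."""
    if driver.supports_fast_swap:
        driver.fast_swap(instance, old_volume, new_volume)
        return "fast"
    slow_swap(driver, instance, old_volume, new_volume)
    return "slow"
```

The try/finally is my own addition to the five steps: it is there so a failed copy does not leave the guest paused. Whether the old volume should stay attached in that case is an open question this sketch does not answer.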

Now for questions:
 A) Does the fast swap with fallback seem like a reasonable approach?
 B) Is this feature useful enough to justify the effort?
 C) Does anyone have alternate proposals?

Thanks,
Vish

