[openstack-dev] [nova][cinder] Disabling nova volume-update (aka swap volume; aka cinder live migration)
Gorka Eguileor
geguileo at redhat.com
Thu Aug 23 09:31:43 UTC 2018
On 22/08, Matthew Booth wrote:
> On Wed, 22 Aug 2018 at 10:47, Gorka Eguileor <geguileo at redhat.com> wrote:
> >
> > On 20/08, Matthew Booth wrote:
> > > For those who aren't familiar with it, nova's volume-update (also
> > > called swap volume by nova devs) is the nova part of the
> > > implementation of cinder's live migration (also called retype).
> > > Volume-update is essentially an internal cinder<->nova api, but as
> > > that's not a thing it's also unfortunately exposed to users. Some
> > > users have found it and are using it, but because it's essentially an
> > > internal cinder<->nova api it breaks pretty easily if you don't treat
> > > it like a special snowflake. It looks like we've finally found a way
> > > it's broken for non-cinder callers that we can't fix, even with a
> > > dirty hack.
> > >
> > > volume-update <server> <old> <new> essentially does a live copy of the
> > > data on <old> volume to <new> volume, then seamlessly swaps the
> > > attachment to <server> from <old> to <new>. The guest OS on <server>
> > > will not notice anything at all as the hypervisor swaps the storage
> > > backing an attached volume underneath it.
> > >
> > > When called by cinder, as intended, cinder does some post-operation
> > > cleanup such that <old> is deleted and <new> inherits the same
> > > volume_id; that is <old> effectively becomes <new>. When called any
> > > other way, however, this cleanup doesn't happen, which breaks a bunch
> > > of assumptions. One of these is that a disk's serial number is the
> > > same as the attached volume_id. Disk serial number, in KVM at least,
> > > is immutable, so can't be updated during volume-update. This is fine
> > > if we were called via cinder, because the cinder cleanup means the
> > > volume_id stays the same. If called any other way, however, they no
> > > longer match, at least until a hard reboot when it will be reset to
> > > the new volume_id. It turns out this breaks live migration, but
> > > probably other things too. We can't think of a workaround.
> > >
> > > I wondered why users would want to do this anyway. It turns out that
> > > sometimes cinder won't let you migrate a volume, but nova
> > > volume-update doesn't do those checks (as they're specific to cinder
> > > internals, none of nova's business, and duplicating them would be
> > > fragile, so we're not adding them!). Specifically we know that cinder
> > > won't let you migrate a volume with snapshots. There may be other
> > > reasons. If cinder won't let you migrate your volume, you can still
> > > move your data by using nova's volume-update, even though you'll end
> > > up with a new volume on the destination, and a slightly broken
> > > instance. Apparently the former is a trade-off worth making, but the
> > > latter has been reported as a bug.
> > >
> >
> > Hi Matt,
> >
> > As you know, I'm in favor of making this REST API call only authorized
> > for Cinder to avoid messing up the cloud.
> >
> > I know you wanted Cinder to have a solution to do live migrations of
> > volumes with snapshots, and while this is not possible to do in a
> > reasonable fashion, I kept thinking about it given your strong feelings
> > to provide a solution for users that really need this, and I think we
> > may have a "reasonable" compromise.
> >
> > The solution is conceptually simple. We add a new API microversion in
> > Cinder that adds an optional parameter called "generic_keep_source"
> > (defaults to False) to both migrate and retype operations.
> >
> > This means that if the driver-optimized migration cannot do the
> > migration and the generic migration code is the one doing it,
> > then, instead of our final step being to swap the volume IDs and
> > delete the source volume, we would swap the volume IDs
> > and move all the snapshots to reference the new volume. Then we would
> > create a user message with the new ID of the volume.
> >
> > This way we can preserve the old volume with all its snapshots and do
> > the live migration.
> >
> > The implementation is a little bit tricky, as we'll have to add a new
> > "update_migrated_volume" mechanism to support the renaming of both
> > volumes, since the old one wouldn't work with this among other things,
> > but it's doable.
> >
> > Unfortunately I don't have the time right now to work on this...
>
> Sounds promising, and honestly more than I'd have hoped for.
>
> Matt
>
Hi Matt,

Reading Sean's reply I notice that I phrased that wrong. The volume on
the new storage backend wouldn't have any snapshots.

The result of the operation would be a new volume with the old ID and no
snapshots (this would be the one in use by Nova), and the old volume
with all the snapshots having a new ID on the DB.

Due to Cinder's mechanism to create this new volume we wouldn't be
returning it on the REST API call, but as a user message instead.
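
To make the intended end state concrete, here is a rough sketch in Python.
It is a toy in-memory model, not Cinder code: the Volume class, the
finish_generic_migration function and the dict standing in for the DB are
all hypothetical names made up for illustration. Only the
"generic_keep_source" idea and the ID swap come from the proposal above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Volume:
    """Hypothetical stand-in for a volume row in the DB."""
    id: str
    backend: str
    snapshots: List[str] = field(default_factory=list)
    attached_to: Optional[str] = None


def finish_generic_migration(db: Dict[str, Volume], source_id: str,
                             dest_id: str,
                             keep_source: bool = False) -> Optional[str]:
    """Final step of a generic migration in this toy model.

    Returns the text of a user message when the source is kept,
    otherwise None.
    """
    source = db[source_id]  # volume on the old backend
    dest = db[dest_id]      # copy of the data on the new backend

    if source.snapshots and not keep_source:
        # Assumption: mirrors today's refusal to migrate with snapshots.
        raise ValueError("volume has snapshots; migration refused")

    # Swap the IDs: the copy on the new backend takes over the old ID,
    # so the ID that Nova has attached does not change.
    source.id, dest.id = dest.id, source.id
    dest.attached_to, source.attached_to = source.attached_to, None
    db[source_id] = dest

    if keep_source:
        # The old volume keeps all its snapshots under a new ID.
        db[dest_id] = source
        return f"Source volume kept with new ID {source.id}"

    # Default behaviour: the source volume is deleted.
    del db[dest_id]
    return None


db = {
    "vol-A": Volume("vol-A", "old", ["snap-1", "snap-2"], "server-1"),
    "vol-B": Volume("vol-B", "new"),
}
message = finish_generic_migration(db, "vol-A", "vol-B", keep_source=True)
```

After the call, "vol-A" is the snapshot-free volume on the new backend,
still attached to the server, while the old volume and its snapshots live
on under "vol-B". That new ID is only reported through the message.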

Sorry for the confusion.

Cheers,
Gorka.