[openstack-dev] [nova][libvirt] Deprecating the live_migration_flag and block_migration_flag config options
Mark McLoughlin
markmc at redhat.com
Fri Jan 8 18:28:45 UTC 2016
On Fri, 2016-01-08 at 14:11 +0000, Daniel P. Berrange wrote:
> On Thu, Jan 07, 2016 at 09:07:00PM +0000, Mark McLoughlin wrote:
> > On Thu, 2016-01-07 at 12:23 +0100, Sahid Orentino Ferdjaoui wrote:
> > > On Mon, Jan 04, 2016 at 09:12:06PM +0000, Mark McLoughlin wrote:
> > > > Hi
> > > >
> > > > commit 8ecf93e[1] got me thinking - the live_migration_flag config
> > > > option unnecessarily allows operators to choose arbitrary behavior of the
> > > > migrateToURI() libvirt call, to the extent that we allow the operator
> > > > to configure a behavior that can result in data loss[1].
> > > >
> > > > I see that danpb recently said something similar:
> > > >
> > > > https://review.openstack.org/171098
> > > >
> > > > "Honestly, I wish we'd just kill off 'live_migration_flag' and
> > > > 'block_migration_flag' as config options. We really should not be
> > > > exposing low level libvirt API flags as admin tunable settings.
> > > >
> > > > Nova should really be in charge of picking the correct set of flags
> > > > for the current libvirt version, and the operation it needs to
> > > > perform. We might need to add other more sensible config options in
> > > > their place [..]"
> > >
> > > Nova should really handle internal flags and this series is heading in
> > > the right direction.
> > >
> > > > ...
> > >
> > > > 4) Add a new config option for tunneled versus native:
> > > >
> > > > [libvirt]
> > > > live_migration_tunneled = true
> > > >
> > > > This enables the use of the VIR_MIGRATE_TUNNELLED flag. We have
> > > > historically defaulted to tunneled mode because it requires the
> > > > least configuration and is currently the only way to have a
> > > > secure migration channel.
> > > >
> > > > danpb's quote above continues with:
> > > >
> > > > "perhaps a "live_migration_secure_channel" to indicate that
> > > > migration must use encryption, which would imply use of
> > > > TUNNELLED flag"
> > > >
> > > > So we need to discuss whether the config option should express the
> > > > choice of tunneled vs native, or whether it should express another
> > > > choice which implies tunneled vs native.
> > > >
> > > > https://review.openstack.org/263434
> > >
> > > We probably have to consider that operators do not know much about
> > > internal libvirt flags, so the options we expose to them should
> > > reflect the benefit of using them. I commented on your review that we
> > > should at least explain the benefit of using this option, whatever the
> > > name is.
> >
> > As predicted, plenty of discussion on this point in the review :)
> >
> > You're right that we don't give the operator any guidance in the help
> > message about how to choose true or false for this:
> >
> > Whether to use tunneled migration, where migration data is
> > transported over the libvirtd connection. If True,
> > we use the VIR_MIGRATE_TUNNELLED migration flag
> >
> > libvirt's own docs on this are here:
> >
> > https://libvirt.org/migration.html#transport
> >
> > which emphasizes:
> >
> > - the data copies involved in tunneling
> > - the extra configuration steps required for native
> > - the encryption support you get when tunneling
> >
> > The discussions I've seen on this topic wrt Nova have revolved around:
> >
> > - that tunneling allows for an encrypted transport[1]
> > - that qemu's NBD based drive-mirror block migration isn't supported
> > using tunneled mode, and that danpb is working on fixing this
> > limitation in libvirt
> > - "selective" block migration[2] won't work with the fallback qemu
> > block migration support, and so won't currently work in tunneled
> > mode
>
> I'm not working on fixing it, but IIRC some other dev had proposed
> patches.
>
> >
> > So, the advice to operators would be:
> > 
> > - You may want to choose tunneled=False for improved block migration
> > capabilities, but this limitation will go away in future.
> > - You may want to choose tunneled=False if you wish to trade an
> > encrypted transport for a (potentially negligible) performance
> > improvement.
> >
> > Does that make sense?
> >
> > As for how to name the option, and as I said in the review, I think it
> > makes sense to be straightforward here and make it clearly about
> > choosing to disable libvirt's tunneled transport.
> >
> > If we name it any other way, I think our explanation for operators will
> > immediately jump to explaining (a) that it influences the TUNNELLED
> > flag, and (b) the differences between the tunneled and native
> > transports. So, if we're going to have to talk about tunneled versus
> > native, why obscure that detail?
>
> Ultimately we need to recognise that libvirt's tunnelled mode was
> added as a hack, to work around the fact that QEMU lacked any kind of
> native security capabilities & didn't appear likely to ever get
> them at that time. As well as not working with modern NBD based
> block device encryption, it really sucks for performance because
> it introduces many extra data copies. So it is going to be quite
> poor for large VMs with a heavy rate of data dirtying.
>
> The only long term relative "benefit" of tunnelled mode is that
> it avoids the need to open extra firewall ports.
>
> IMHO, the long term future is to *never* use tunnelled mode for
> QEMU. This will be viable when my work on native TLS support
> in QEMU migration + NBD protocols is merged. I'm hopeful this
> will be for QEMU 2.6
>
> > But, Pawel strongly disagrees.
> >
> > One last point I'd make is this isn't about adding a *new*
> > configuration capability for operators. As we deprecate and remove
> > these configuration options, we need to be careful not to remove a
> > capability that operators are currently depending on for arguably
> > reasonable reasons.
>
> My view is that "live_migration_tunneled" is a reasonable parameter
> to add, because there is a genuine need to let admins choose this
> behaviour. We should make sure it is correctly done as a tri-state
> flag though: when it is 'None', Nova should pick what it thinks is
> the best approach based on QEMU version. Probably to use QEMU
> native when it supports TLS, otherwise use tunnelled if possible
> to get security.
Great feedback. I buy that.
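To make the tri-state concrete, here is a rough sketch of the decision danpb describes. It is not the actual Nova patch: the function name, the tunnelling_possible parameter and the version constant are illustrative assumptions. In particular, 2.6 is only the release danpb is hopeful native TLS will land in, not a confirmed fact.

```python
from typing import Optional, Tuple

# ASSUMPTION: first QEMU release with native TLS migration support.
# The thread only says this is hoped for in 2.6.
ASSUMED_QEMU_NATIVE_TLS_VERSION = (2, 6, 0)


def should_use_tunnelled(
    live_migration_tunneled: Optional[bool],
    qemu_version: Tuple[int, int, int],
    tunnelling_possible: bool = True,
) -> bool:
    """Decide whether VIR_MIGRATE_TUNNELLED should be set (hypothetical helper).

    live_migration_tunneled is the tri-state config value:
    True or False is an explicit operator choice, None means
    "let Nova pick".
    """
    # An explicit operator choice always wins.
    if live_migration_tunneled is not None:
        return live_migration_tunneled

    # Native transport is preferred once QEMU can secure it itself.
    if qemu_version >= ASSUMED_QEMU_NATIVE_TLS_VERSION:
        return False

    # Otherwise tunnel, if that is possible, to get an encrypted channel.
    return tunnelling_possible
```

With the option left unset, an older QEMU would be tunnelled and a TLS-capable QEMU would use the native transport; setting the option to True or False overrides both cases.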
> > [1] - https://review.openstack.org/#/c/171098/
> > [2] - https://review.openstack.org/#/c/227278
> >
> >
> > > > 5) Add a new config option for additional migration flags:
> > > >
> > > > [libvirt]
> > > > live_migration_extra_flags = VIR_MIGRATE_COMPRESSED
> > > >
> > > > This allows operators to continue to experiment with libvirt behaviors
> > > > in safe ways without each use case having to be accounted for.
> > > >
> > > > https://review.openstack.org/263435
> > > >
> > > > We would disallow setting the following flags via this option:
> > > >
> > > > VIR_MIGRATE_LIVE
> > > > VIR_MIGRATE_PEER2PEER
> > > > VIR_MIGRATE_TUNNELLED
> > > > VIR_MIGRATE_PERSIST_DEST
> > > > VIR_MIGRATE_UNDEFINE_SOURCE
> > > > VIR_MIGRATE_NON_SHARED_INC
> > > > VIR_MIGRATE_NON_SHARED_DISK
> > > >
> > > > which would allow the following currently available flags to be set:
> > >
> > > > VIR_MIGRATE_PAUSED
> > > > VIR_MIGRATE_CHANGE_PROTECTION
> > > > VIR_MIGRATE_UNSAFE
> > > > VIR_MIGRATE_OFFLINE
> > > > VIR_MIGRATE_COMPRESSED
> > > > VIR_MIGRATE_ABORT_ON_ERROR
> > > > VIR_MIGRATE_AUTO_CONVERGE
> > > > VIR_MIGRATE_RDMA_PIN_ALL
> > >
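As a rough illustration of the deny-list quoted above (a sketch only, not code from the review): the function name and the comma-separated value format are assumptions, and the flag names are compared as plain strings so that the example does not need the libvirt bindings.

```python
from typing import List

# Flags Nova would manage itself, taken from the list quoted above.
DISALLOWED_EXTRA_FLAGS = frozenset({
    "VIR_MIGRATE_LIVE",
    "VIR_MIGRATE_PEER2PEER",
    "VIR_MIGRATE_TUNNELLED",
    "VIR_MIGRATE_PERSIST_DEST",
    "VIR_MIGRATE_UNDEFINE_SOURCE",
    "VIR_MIGRATE_NON_SHARED_INC",
    "VIR_MIGRATE_NON_SHARED_DISK",
})


def parse_extra_flags(value: str) -> List[str]:
    """Split a comma-separated flag list and reject disallowed flags.

    Hypothetical helper; the comma-separated format is an assumption.
    """
    names = [name.strip() for name in value.split(",") if name.strip()]
    rejected = sorted(set(names) & DISALLOWED_EXTRA_FLAGS)
    if rejected:
        raise ValueError(
            "flags not allowed in live_migration_extra_flags: "
            + ", ".join(rejected)
        )
    return names
```

A value such as "VIR_MIGRATE_COMPRESSED, VIR_MIGRATE_AUTO_CONVERGE" would pass, while anything containing VIR_MIGRATE_TUNNELLED would be rejected at parse time.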
> > > We could probably consider providing VIR_MIGRATE_PAUSED and
> > > VIR_MIGRATE_COMPRESSED as dedicated options too?
> > 
> > Yes. For any flags we see operators regularly adding to extra_flags,
> > and as we understand the use cases better, we can add dedicated
> > options.
>
> I really don't see a case for letting the admin set VIR_MIGRATE_PAUSED
> at a host level. If we want the ability to force a running VM to end
> up paused after migration, this is a feature to be added to the Nova
> migration API.
>
> The VIR_MIGRATE_COMPRESSED is not as simple as just enabling a flag,
> there are other associated runtime tunables that need setting. There
> was a spec discussing this which was not approved as a suitable
> strategy for using it could not be agreed.
>
> > In the review, Pawel is making a case for not allowing the operator to
> > enable COMPRESSED or AUTO_CONVERGE.
>
> I agree really. As per my comments, I in fact struggle to see a credible
> case for allowing any of the remaining flags to be enabled. They are all
> cases where Nova should be made to do the right thing, possibly in relation
> to API parameters or other deployment choices.
Fair enough. I figured it would be a necessary safety valve against
operators who value the flexibility of the current configuration
options, but you make a good case.
I'll drop the extra_flags option and make tunneled a tri-state.
Thanks,
Mark.