[openstack-dev] [nova][libvirt] Deprecating the live_migration_flag and block_migration_flag config options

Mark McLoughlin markmc at redhat.com
Fri Jan 8 18:28:45 UTC 2016


On Fri, 2016-01-08 at 14:11 +0000, Daniel P. Berrange wrote:
> On Thu, Jan 07, 2016 at 09:07:00PM +0000, Mark McLoughlin wrote:
> > On Thu, 2016-01-07 at 12:23 +0100, Sahid Orentino Ferdjaoui wrote:
> > > On Mon, Jan 04, 2016 at 09:12:06PM +0000, Mark McLoughlin wrote:
> > > > Hi
> > > > 
> > > > commit 8ecf93e[1] got me thinking - the live_migration_flag config
> > > > option unnecessarily allows operators to choose arbitrary behavior of the
> > > > migrateToURI() libvirt call, to the extent that we allow the operator
> > > > to configure a behavior that can result in data loss[1].
> > > > 
> > > > I see that danpb recently said something similar:
> > > > 
> > > >   https://review.openstack.org/171098
> > > > 
> > > >   "Honestly, I wish we'd just kill off  'live_migration_flag' and
> > > >   'block_migration_flag' as config options. We really should not be
> > > >   exposing low level libvirt API flags as admin tunable settings.
> > > > 
> > > >   Nova should really be in charge of picking the correct set of flags
> > > >   for the current libvirt version, and the operation it needs to
> > > >   perform. We might need to add other more sensible config options in
> > > >   their place [..]"
> > > 
> > > Nova should really handle internal flags and this series is heading in
> > > the right direction.
> > > 
> > > > ...
> > > 
> > > >   4) Add a new config option for tunneled versus native:
> > > > 
> > > >        [libvirt]
> > > >        live_migration_tunneled = true
> > > > 
> > > >      This enables the use of the VIR_MIGRATE_TUNNELLED flag. We have 
> > > >      historically defaulted to tunneled mode because it requires the 
> > > >      least configuration and is currently the only way to have a 
> > > >      secure migration channel.
> > > > 
> > > >      danpb's quote above continues with:
> > > > 
> > > >        "perhaps a "live_migration_secure_channel" to indicate that 
> > > >         migration must use encryption, which would imply use of 
> > > >         TUNNELLED flag"
> > > > 
> > > >      So we need to discuss whether the config option should express the
> > > >      choice of tunneled vs native, or whether it should express another
> > > >      choice which implies tunneled vs native.
> > > > 
> > > >        https://review.openstack.org/263434
> > > 
> > > We probably have to consider that operators do not know much about
> > > internal libvirt flags, so the options we expose to them should
> > > reflect the benefit of using them. I commented on your review that we
> > > should at least explain the benefit of using this option, whatever its
> > > name is.
> > 
> > As predicted, plenty of discussion on this point in the review :)
> > 
> > You're right that we don't give the operator any guidance in the help
> > message about how to choose true or false for this:
> > 
> >   Whether to use tunneled migration, where migration data is 
> >   transported over the libvirtd connection. If True,
> >   we use the VIR_MIGRATE_TUNNELLED migration flag
> > 
> > libvirt's own docs on this are here:
> > 
> >   https://libvirt.org/migration.html#transport
> > 
> > which emphasizes:
> > 
> >   - the data copies involved in tunneling
> >   - the extra configuration steps required for native
> >   - the encryption support you get when tunneling
> > 
> > The discussions I've seen on this topic wrt Nova have revolved around:
> > 
> >   - that tunneling allows for an encrypted transport[1]
> >   - that qemu's NBD based drive-mirror block migration isn't supported
> >     using tunneled mode, and that danpb is working on fixing this
> >     limitation in libvirt
> >   - "selective" block migration[2] won't work with the fallback qemu
> >     block migration support, and so won't currently work in tunneled
> >     mode
> 
> I'm not working on fixing it, but IIRC some other dev had proposed
> patches.
> 
> > 
> > So, the advice to operators would be:
> > 
> >   - You may want to choose tunneled=False for improved block migration 
> >     capabilities, but this limitation will go away in future.
> >   - You may want to choose tunneled=False if you wish to trade an
> >     encrypted transport for a (potentially negligible) performance
> >     improvement.
> > 
> > Does that make sense?
> > 
> > As for how to name the option, and as I said in the review, I think it
> > makes sense to be straightforward here and make it clearly about
> > choosing to disable libvirt's tunneled transport.
> > 
> > If we name it any other way, I think our explanation for operators will
> > immediately jump to explaining (a) that it influences the TUNNELLED
> > flag, and (b) the differences between the tunneled and native
> > transports. So, if we're going to have to talk about tunneled versus
> > native, why obscure that detail?
> 
> Ultimately we need to recognise that libvirt's tunnelled mode was
> added as a hack, to work around the fact that QEMU lacked any kind of
> native security capabilities & didn't appear likely to ever get
> them at that time.  As well as not working with modern NBD based
> block device encryption, it really sucks for performance because
> it introduces many extra data copies. So it is going to be quite
> poor for large VMs with a heavy rate of data dirtying.
> 
> The only long term relative "benefit" of tunnelled mode is that
> it avoids the need to open extra firewall ports.
> 
> IMHO, the long term future is to *never* use tunnelled mode for
> QEMU. This will be viable when my work on native TLS support
> in QEMU migration + NBD protocols is merged. I'm hopeful this
> will be in QEMU 2.6.
> 
> > But, Pawel strongly disagrees.
> > 
> > One last point I'd make is this isn't about adding a *new*
> > configuration capability for operators. As we deprecate and remove
> > these configuration options, we need to be careful not to remove a
> > capability that operators are currently depending on for arguably
> > reasonable reasons.
> 
> My view is that "live_migration_tunneled" is a reasonable parameter
> to add, because there is a genuine need to let admins choose this
> behaviour. We should make sure it is correctly done as a tri-state
> flag though: when it is 'None', Nova should pick what it thinks is
> the best approach based on QEMU version. Probably to use QEMU
> native when it supports TLS, otherwise use tunnelled if possible
> to get security.

Great feedback. I buy that.
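
To make the tri-state idea concrete, here is a minimal sketch of how the
choice could be resolved. It is only an illustration under stated
assumptions, not Nova's implementation: the function names and the
qemu_has_native_tls / tunnelling_possible inputs are hypothetical, and how
Nova would actually detect native TLS support is not settled in this
thread. The flag constants mirror libvirt's values so the sketch runs
without libvirt installed.

```python
from typing import Optional

# Stand-ins mirroring libvirt's virDomainMigrateFlags values, so this
# sketch does not need the libvirt bindings.
VIR_MIGRATE_LIVE = 1
VIR_MIGRATE_PEER2PEER = 2
VIR_MIGRATE_TUNNELLED = 4


def use_tunnelled(configured: Optional[bool],
                  qemu_has_native_tls: bool,
                  tunnelling_possible: bool = True) -> bool:
    """Resolve a tri-state live_migration_tunneled setting.

    configured is True/False when the operator set the option explicitly,
    or None when Nova should choose. The other two arguments are
    hypothetical capability checks, assumed to be computed elsewhere.
    """
    if configured is not None:
        # An explicit operator choice always wins.
        return configured
    if qemu_has_native_tls:
        # Native transport can be secured, so avoid the tunnel's extra copies.
        return False
    # No native TLS: fall back to tunnelling, if possible, to get security.
    return tunnelling_possible


def migration_flags(base_flags: int,
                    configured: Optional[bool],
                    qemu_has_native_tls: bool) -> int:
    """Add or clear VIR_MIGRATE_TUNNELLED on top of Nova-chosen base flags."""
    if use_tunnelled(configured, qemu_has_native_tls):
        return base_flags | VIR_MIGRATE_TUNNELLED
    return base_flags & ~VIR_MIGRATE_TUNNELLED
```

With the option left unset, an older QEMU without native TLS would end up
with the TUNNELLED bit set, while a QEMU with native TLS would not; an
explicit True or False overrides both cases.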

> > [1] - https://review.openstack.org/#/c/171098/
> > [2] - https://review.openstack.org/#/c/227278
> > 
> > 
> > > >   5) Add a new config option for additional migration flags:
> > > > 
> > > >        [libvirt]
> > > >        live_migration_extra_flags = VIR_MIGRATE_COMPRESSED
> > > > 
> > > >      This allows operators to continue to experiment with libvirt behaviors
> > > >      in safe ways without each use case having to be accounted for.
> > > > 
> > > >        https://review.openstack.org/263435
> > > > 
> > > >      We would disallow setting the following flags via this option:
> > > > 
> > > >        VIR_MIGRATE_LIVE
> > > >        VIR_MIGRATE_PEER2PEER
> > > >        VIR_MIGRATE_TUNNELLED
> > > >        VIR_MIGRATE_PERSIST_DEST
> > > >        VIR_MIGRATE_UNDEFINE_SOURCE
> > > >        VIR_MIGRATE_NON_SHARED_INC
> > > >        VIR_MIGRATE_NON_SHARED_DISK
> > > > 
> > > >     which would allow the following currently available flags to be set:
> > > 
> > > >        VIR_MIGRATE_PAUSED
> > > >        VIR_MIGRATE_CHANGE_PROTECTION
> > > >        VIR_MIGRATE_UNSAFE
> > > >        VIR_MIGRATE_OFFLINE
> > > >        VIR_MIGRATE_COMPRESSED
> > > >        VIR_MIGRATE_ABORT_ON_ERROR
> > > >        VIR_MIGRATE_AUTO_CONVERGE
> > > >        VIR_MIGRATE_RDMA_PIN_ALL
> > > 
> > > We could probably consider providing VIR_MIGRATE_PAUSED and
> > > VIR_MIGRATE_COMPRESSED as dedicated options too?
> > 
> > Yes, I think that for any flags we see operators regularly adding to
> > extra_flags, and as we understand the use cases better, we can add
> > dedicated options for them.
> 
> I really don't see a case for letting the admin set VIR_MIGRATE_PAUSED
> at a host level. If we want the ability to force a running VM to end
> up paused after migration, this is a feature to be added to the Nova
> migration API.
> 
> The VIR_MIGRATE_COMPRESSED is not as simple as just enabling a flag,
> there are other associated runtime tunables that need setting. There
> was a spec discussing this which was not approved, as a suitable
> strategy for using it could not be agreed.
> 
> > In the review, Pawel is making a case for not allowing the operator to
> > enable COMPRESSED or AUTO_CONVERGE.
> 
> I agree really. As per my comments, I in fact struggle to see a credible
> case for allowing any of the remaining flags to be enabled. They are all
> cases that Nova should be made to do the right thing, possibly in relation
> to API parameters or other deployment choices.

Fair enough. I figured it would be a necessary safety valve against
operators who value the flexibility of the current configuration
options, but you make a good case.

I'll drop the extra_flags option and make tunneled a tri-state.

Thanks,
Mark.


