[openstack-dev] [Nova][Neutron][Live-Migration] Cross l2 agent migration and solving Nova-Neutron live migration bugs

Murray, Paul (HP Cloud) pmurray at hpe.com
Fri Jul 1 14:34:38 UTC 2016



> -----Original Message-----
> From: Carl Baldwin [mailto:carl at ecbaldwin.net]
> Sent: 29 June 2016 22:20
> To: OpenStack Development Mailing List (not for usage questions)
> <openstack-dev at lists.openstack.org>
> Subject: Re: [openstack-dev] [Nova][Neutron][Live-Migration] Cross l2 agent
> migration and solving Nova-Neutron live migration bugs
> 
> On Tue, Jun 28, 2016 at 9:42 AM, Andreas Scheuring
> <scheuran at linux.vnet.ibm.com> wrote:
> > I'm currently working on solving Nova-Neutron issues during live
> > migration. This mail is intended to raise awareness across projects and
> > get things kicked off.
> 
> Thanks for sending this.
> 
> > The issues
> > ==========
> >
> > #1 When port binding fails, the instance is migrated but stuck in
> > error state
> > #2 Macvtap agent live migration when source and target use different
> > physical_interface_mapping [3]. Either the migration fails (good case)
> > or the instance is placed on the wrong network (worst case)
> > #3 (more a feature): Migration across l2 agents is not possible (e.g.
> > migrate from an lb host to an ovs host, or from ovs-hybrid to a new
> > ovsfirewall host)
> 
> Since all of these issues are experienced when using live migration, a Nova
> feature, I was interested in finding out how Nova prioritizes getting a fix and
> then trying to align Neutron's priority with it.  It would seem artificial
> to me to drive the priority from Neutron.  That's why I mentioned it.  Since Nova
> is hitting a freeze for non-priority work tomorrow, I don't think anything can
> be done for Newton.  However, there is a lot of time to tee up this
> conversation for Ocata if we get on it.
> 

I read the comments from the neutron meeting here: http://eavesdrop.openstack.org/meetings/neutron_drivers/2016/neutron_drivers.2016-06-30-22.00.log.html and thought a little commentary on our priorities over the last couple of cycles might clear up any misconceptions.

Live migration was a priority in Mitaka because it simply was not up to scratch for production use. The main objective was to make it work and make it manageable. That translated to:

1. Expand CI coverage
2. Manage migrations: monitor progress, cancel migrations, force completion of migrations that are not converging
3. Extend use cases: allow mix of volumes, shared storage and local disks to be migrated
4. Some other things: simplify config and APIs, add scheduling support, separate migration traffic from the management network

These were mostly covered, including some supporting work on qemu and libvirt.

We next wanted to do some security work (refactoring the image backend and removing the ssh-based copy, aka storage pools), but that could not be done in Mitaka and was deferred. The priority for Newton was specifically this security work, plus continuing efforts on CI, which is now making progress (see the sterling work by Daniel Berrange over the last couple of days: http://lists.openstack.org/pipermail/openstack-dev/2016-June/098540.html )

 
> > The proposal
> > ============
> > All those problems could be solved with the same approach. The
> > proposal is to bind a port to the source AND to the target host
> > during migration.
> >

Good - it is frankly ridiculous that we irreversibly commit a migration to the destination host before we know the networks can be built out there. The proposal is basically what we already do with cinder: volumes are attached at both source and destination during the migration.
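
To make the dual-binding flow concrete, here is a rough sketch of how it could look from Nova's side. Everything in it - the endpoint paths, payloads and helper names - is purely illustrative and not an agreed Neutron API; the real shape is exactly what spec [1] needs to nail down.

import requests

# Illustrative only: these paths and payloads are NOT an agreed Neutron
# API, they just show the shape of the dual-binding idea from spec [1].
NEUTRON_URL = "http://neutron.example.com:9696/v2.0"
HEADERS = {"X-Auth-Token": "<admin-token>"}


def bind_port_on_destination(port_id, dest_host):
    """Ask Neutron for an additional, inactive binding on the destination
    host while the source binding stays active."""
    resp = requests.post(
        "%s/ports/%s/bindings" % (NEUTRON_URL, port_id),
        json={"binding": {"host": dest_host}},
        headers=HEADERS)
    # If the destination l2 agent cannot bind the port (wrong physnet
    # mapping, missing mechanism driver, ...), Neutron refuses here and
    # Nova can abort while the instance still runs on the source (#1, #2).
    resp.raise_for_status()
    return resp.json()["binding"]


def activate_destination_binding(port_id, dest_host):
    """Once the guest has switched over, make the destination binding the
    active one; the source binding can then be dropped."""
    resp = requests.put(
        "%s/ports/%s/bindings/%s/activate"
        % (NEUTRON_URL, port_id, dest_host),
        headers=HEADERS)
    resp.raise_for_status()

The key point is that the destination binding, and therefore its vif details, exist before libvirt is told to move the guest, so the instance definition (domain.xml) can be rewritten for the destination's vif type up front.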


> > * Neutron would need to allow multiple bindings for a compute port and
> > externalize that via API.
> >   - Neutron Spec [1]
> >   - Bug [4]  is a prereq to the spec.
> >
> > * Nova would need to use those new APIs to check in
> > pre_live_migration whether the binding for the target host is valid,
> > and to modify the instance definition (e.g. domain.xml) during migration.
> >   - Nova Spec [2]
> >
> > This would solve the issues in the following way:
> > #1 The migration is aborted before it starts, so the instance is
> > still usable
> > #2 Migration is possible with all configurations
> > #3 Such a migration becomes possible
> >
> > Coordination
> > ============
> > Some coordination between Nova & Neutron is required. Per today's
> > Nova Live Migration Meeting [5], this will happen at the Nova midcycle.
> > I put an item on the agenda [6].
> 
> I'll be there.

Yes, Good. 


> 
> > It would be great if anybody interested in this bugfix/feature
> > could comment on the specs [1] or [2] to get as much feedback as
> > possible before the Nova midcycle in July!
> >
> > Thank you!
> 

I don't think this is actually a large amount of work on the Nova side. We need to get the API right - I will leave comments on the specs.
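
For what it's worth, the shape I have in mind on the Nova side is roughly the sketch below. It reuses the hypothetical bind_port_on_destination helper from the earlier sketch, plus a made-up delete_destination_binding(port_id, host) to undo a binding; none of these names come from spec [2], they are just for illustration.

def check_destination_bindings(network_info, dest_host):
    """Pre-check for live migration: make sure every port of the instance
    can be bound on the destination host before anything is moved.

    network_info is assumed to be a list of dicts with an 'id' key, one
    per VIF of the instance being migrated."""
    bound_ports = []
    try:
        for vif in network_info:
            # Create (or validate) an inactive binding on the destination.
            bind_port_on_destination(vif['id'], dest_host)
            bound_ports.append(vif['id'])
    except Exception:
        # Roll back the bindings created so far and fail the pre-check;
        # the instance keeps running untouched on the source host (#1).
        for port_id in bound_ports:
            delete_destination_binding(port_id, dest_host)
        raise
    return bound_ports

Running something like this as part of pre_live_migration is the bulk of it; the rest is the domain.xml rewriting mentioned in the proposal above.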


> >
> > [1] Neutron spec: https://review.openstack.org/#/c/309416
> > [2] Nova spec: https://review.openstack.org/301090
> > [3] macvtap bug: https://bugs.launchpad.net/neutron/+bug/1550400
> > [4] https://bugs.launchpad.net/neutron/+bug/1367391
> > [5] http://eavesdrop.openstack.org/meetings/nova_live_migration/2016/nova_live_migration.2016-06-28-14.00.log.html
> >
> > [6] https://etherpad.openstack.org/p/nova-newton-midcycle
> 


