[openstack-dev] [Nova] Live Migration: Austin summit update

Matt Riedemann mriedem at linux.vnet.ibm.com
Sat Apr 30 01:31:57 UTC 2016


On 4/29/2016 5:32 PM, Murray, Paul (HP Cloud) wrote:
> The following summarizes the status of the main topics relating to live
> migration after the Newton design summit. Please feel free to correct
> any inaccuracies or add additional information.
>
>
>
> Paul
>
>
>
> -------------------------------------------------------------
>
>
>
> Libvirt storage pools
>
>
>
> The storage pools work has been selected as one of the project review
> priorities for Newton.
>
> (see https://etherpad.openstack.org/p/newton-nova-summit-priorities )
>
>
>
> Continuation of the libvirt storage pools work was discussed in the live
> migration session. The proposal has grown to include a refactor of the
> existing libvirt driver instance storage code. Justification for this is
> based on three factors:
>
> 1.       The code needs to be refactored to use storage pools
>
> 2.       The code is complicated and relies on inspection, which is poor practice
>
> 3.       During the investigation Matt Booth discovered two CVEs in the
> code – suggesting further work is justified
>
>
>
> So the proposal is now to follow three stages:
>
> 1.       Refactor the instance storage code
>
> 2.       Adapt to use storage pools for the instance storage
>
> 3.       Use storage pools to drive resize/migration

We also talked about the need for some additional test coverage for the 
refactor work:

1. A job that uses LVM on the experimental queue.

2. ploop should be covered by the Virtuozzo Compute third party CI but 
we'll need to double-check the test coverage there (that is, whether it 
runs the tests that hit the code paths being refactored). Note that they 
have their own blueprint for implementing resize for ploop:

https://blueprints.launchpad.net/nova/+spec/virtuozzo-instance-resize-support

3. Ceph testing - we already have a single-node job for Ceph that will 
test the resize paths. We should also be testing Ceph-backed live 
migration in the special live-migration job that Timofey has been 
working on.

4. NFS testing - this also falls into the special live migration CI job 
that will test live migration in different storage configurations within 
a single run.

>
>
>
> Matt already has code starting the refactor and will continue with help
> from Paul Carlton + Paul Murray. We will look for additional
> contributors to help as we plan out the patches.
>
>
>
> https://review.openstack.org/#/c/302117 : Persist libvirt instance
> storage metadata
>
> https://review.openstack.org/#/c/310505 : Use libvirt storage pools
>
> https://review.openstack.org/#/c/310538 : Migrate libvirt volumes
>
>
>
> Post copy
>
>
>
> The spec to add post copy migration support in the libvirt driver was
> discussed in the live migration session. Post copy guarantees completion
> of a migration in linear time without needing to pause the VM. This can
> be used as an alternative to pausing in live-migration-force-complete.
> Pause or complete could also be invoked automatically under some
> circumstances. The issue slowing these specs is how to decide which
> method to use given they provide a different user experience but we
> don’t want to expose virt specific features in the API. Two additional
> specs listed below suggest possible generic ways to address the issue.
>
>
>
> No conclusions were reached in the session, so the debate will
> continue on the specs. The first one listed below is the main spec for
> the feature.
>
>
>
> https://review.openstack.org/#/c/301509 : Adds post-copy live migration
> support to Nova
>
> https://review.openstack.org/#/c/305425 : Define instance availability
> profiles
>
> https://review.openstack.org/#/c/306561 : Automatic Live Migration
> Completion
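
To make the "invoked automatically under some circumstances" idea a bit 
more concrete, here is a minimal sketch of one possible policy. Everything 
in it is an assumption for illustration: the function name, the action 
strings, the idea of sampling remaining data in MB, and the stall 
threshold are hypothetical and are not taken from any of the specs above 
or from Nova itself.

```python
from typing import Sequence


def choose_completion_action(
    remaining_mb: Sequence[float],
    post_copy_available: bool,
    stall_samples: int = 3,
) -> str:
    """Hypothetical policy: return 'wait', 'post-copy' or 'pause'.

    remaining_mb is a series of samples of the data still to be
    transferred. The migration counts as stalled when the remaining
    amount has not decreased over the last `stall_samples` intervals.
    """
    if len(remaining_mb) <= stall_samples:
        # Not enough history yet to judge progress.
        return "wait"

    recent = remaining_mb[-(stall_samples + 1):]
    stalled = all(later >= earlier for earlier, later in zip(recent, recent[1:]))
    if not stalled:
        return "wait"

    # Prefer post copy when the hypervisor offers it, else fall back to pause.
    return "post-copy" if post_copy_available else "pause"
```

The caller only sees a generic outcome here, which is the same concern 
raised above about not exposing virt specific features in the API.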
>
>
>
> Live Migration orchestrated via conductor
>
>
>
> The proposal to move orchestration of live migration to conductor was
> discussed in the working session on Friday, presented by Andrew Laski on
> behalf of Timofey Durakov. This prompted a lot of debate both for and
> against the general idea, but without support for the patches that have
> been submitted along with the spec so far. The general feeling was that
> we need to attack this, but should take some simple cleanup steps first
> to get a better idea of the problem. Dan Smith proposed moving the
> stateless pre-migration steps to a sequence of calls from conductor (as
> opposed to going back and forth between computes) as the first step.
>
>
>
> https://review.openstack.org/#/c/292271 : Remove compute-compute
> communication in live-migration
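
As a rough illustration of the "sequence of calls from conductor" idea, 
here is a minimal sketch of a single coordinator running stateless 
pre-migration steps in order. The names (run_pre_migration_checks, 
PreCheckFailed, the check callables) are hypothetical and are not Nova's 
actual conductor or compute RPC API.

```python
from typing import Callable, List, Sequence, Tuple

# A check takes (source_host, dest_host) and raises on failure.
Check = Callable[[str, str], None]


class PreCheckFailed(Exception):
    """Raised when a pre-migration step fails (hypothetical name)."""


def run_pre_migration_checks(
    source: str, dest: str, checks: Sequence[Tuple[str, Check]]
) -> List[str]:
    """Run each named check in order from one place.

    Stops at the first failure, so the coordinator always knows which
    step failed without the computes calling each other.
    """
    completed = []
    for name, check in checks:
        try:
            check(source, dest)
        except Exception as exc:
            raise PreCheckFailed(f"{name} failed: {exc}") from exc
        completed.append(name)
    return completed
```

Because each step is stateless, a failure partway through leaves nothing 
to roll back on either compute in this sketch.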
>
>
>
> Cold and Live Migration Scheduling
>
>
>
> When this patch merges, all migrations will use the request spec for
> scheduling: https://review.openstack.org/#/c/284974
>
> Work is still ongoing on check destinations (allowing the scheduler to
> check a destination chosen by the admin). When that is complete,
> migrations will have three ways to be placed:
>
> 1.       Destination chosen by scheduler
>
> 2.       Destination chosen by admin but checked by scheduler
>
> 3.       Destination forced by admin
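
The three placement modes above can be sketched roughly as follows. This 
is only an illustration under stated assumptions: Placement, 
pick_destination and the fits callable (standing in for a scheduler 
check) are hypothetical names, not Nova's scheduler interface.

```python
from enum import Enum
from typing import Callable, Iterable, Optional, Tuple


class Placement(Enum):
    SCHEDULER_CHOSEN = 1  # destination chosen by scheduler
    ADMIN_CHECKED = 2     # destination chosen by admin, checked by scheduler
    ADMIN_FORCED = 3      # destination forced by admin


def pick_destination(
    hosts: Iterable[str],
    fits: Callable[[str], bool],
    requested: Optional[str] = None,
    force: bool = False,
) -> Tuple[str, Placement]:
    """Return (host, mode) for a migration; `fits` is the scheduler check."""
    if requested is None:
        # Mode 1: the scheduler picks the first host that passes the check.
        for host in hosts:
            if fits(host):
                return host, Placement.SCHEDULER_CHOSEN
        raise ValueError("no valid host found")
    if force:
        # Mode 3: the admin's choice bypasses the scheduler check entirely.
        return requested, Placement.ADMIN_FORCED
    # Mode 2: the admin's choice must still pass the scheduler check.
    if not fits(requested):
        raise ValueError(f"requested host {requested!r} failed the check")
    return requested, Placement.ADMIN_CHECKED
```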
>
>
>
> https://review.openstack.org/#/c/296408 : Re-Proposes to check destination
> on migrations
>
>
>
> PCI + NUMA claims
>
>
>
> Moshe and Jay are making great progress refactoring Nicola’s patches to
> fix PCI and NUMA handling in migrations. The patch series should be
> completed soon.

The patch series for that is here (dependent on some cleanups from Jay 
and the top patch needs to be rebased):

https://review.openstack.org/#/c/307124/

It would be great if we could test this with some NFV CI but from the 
notes in the session it sounds like we need a multi-node job for this?

>
>
>
>
>

Thanks for the great write-up Paul, you've saved me some time. :) And 
thanks to the whole sub-team working on this for keeping up the focus.

-- 

Thanks,

Matt Riedemann



