[openstack-dev] [Nova] Live Migration: Austin summit update
Murray, Paul (HP Cloud)
pmurray at hpe.com
Sat Apr 30 16:43:47 UTC 2016
Thanks Matt, I meant to cover CI but clearly omitted it.
> On 30 Apr 2016, at 02:35, Matt Riedemann <mriedem at linux.vnet.ibm.com> wrote:
>
>> On 4/29/2016 5:32 PM, Murray, Paul (HP Cloud) wrote:
>> The following summarizes status of the main topics relating to live
>> migration after the Newton design summit. Please feel free to correct
>> any inaccuracies or add additional information.
>>
>>
>>
>> Paul
>>
>>
>>
>> -------------------------------------------------------------
>>
>>
>>
>> Libvirt storage pools
>>
>>
>>
>> The storage pools work has been selected as one of the project review
>> priorities for Newton.
>>
>> (see https://etherpad.openstack.org/p/newton-nova-summit-priorities )
>>
>>
>>
>> Continuation of the libvirt storage pools work was discussed in the live
>> migration session. The proposal has grown to include a refactor of the
>> existing libvirt driver instance storage code. Justification for this is
>> based on three factors:
>>
>> 1. The code needs to be refactored to use storage pools
>>
>> 2. The code is complicated and relies on inspection, which is poor practice
>>
>> 3. During the investigation Matt Booth discovered two CVEs in the
>> code – suggesting further work is justified
>>
>>
>>
>> So the proposal is now to follow three stages:
>>
>> 1. Refactor the instance storage code
>>
>> 2. Adapt to use storage pools for the instance storage
>>
>> 3. Use storage pools to drive resize/migration
>
> We also talked about the need for some additional test coverage for the refactor work:
>
> 1. A job that uses LVM on the experimental queue.
>
> 2. ploop should be covered by the Virtuozzo Compute third party CI, but we'll need to double-check the test coverage there (is it running the tests that hit the code paths being refactored?). Note that they have their own blueprint for implementing resize for ploop:
>
> https://blueprints.launchpad.net/nova/+spec/virtuozzo-instance-resize-support
>
> 3. Ceph testing - we already have a single-node job for Ceph that will test the resize paths. We should also be testing Ceph-backed live migration in the special live-migration job that Timofey has been working on.
>
> 4. NFS testing - this also falls into the special live migration CI job that will test live migration in different storage configurations within a single run.
>
>>
>>
>>
>> Matt has code already starting the refactor and will continue with help
>> from Paul Carlton + Paul Murray. We will look for additional
>> contributors to help as we plan out the patches.
>>
>>
>>
>> https://review.openstack.org/#/c/302117 : Persist libvirt instance
>> storage metadata
>>
>> https://review.openstack.org/#/c/310505 : Use libvirt storage pools
>>
>> https://review.openstack.org/#/c/310538 : Migrate libvirt volumes
>>
>>
>>
>> Post copy
>>
>>
>>
>> The spec to add post copy migration support in the libvirt driver was
>> discussed in the live migration session. Post copy guarantees completion
>> of a migration in linear time without needing to pause the VM. This can
>> be used as an alternative to pausing in live-migration-force-complete.
>> Pause or complete could also be invoked automatically under some
>> circumstances. The issue slowing these specs is how to decide which
>> method to use, given that they provide different user experiences and we
>> don’t want to expose virt-specific features in the API. Two additional
>> specs listed below suggest possible generic ways to address the issue.
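
As a rough illustration of why post copy bounds migration time while pre-copy may not, here is a toy model. It is a sketch only: the function names, the fixed dirty rate, and the 64 MB switch-over threshold are all assumptions made for the example, not anything taken from the specs or from libvirt.

```python
def precopy_rounds(memory_mb, bandwidth_mb_per_s, dirty_rate_mb_per_s,
                   threshold_mb=64, max_rounds=30):
    """Toy model of iterative pre-copy.

    Each round copies the currently dirty memory; while that copy runs,
    the guest dirties more memory at a constant (assumed) rate. Returns
    the number of rounds needed to get under the threshold, or None if
    the model does not converge within max_rounds.
    """
    remaining = float(memory_mb)
    for round_no in range(1, max_rounds + 1):
        transfer_time = remaining / bandwidth_mb_per_s
        # Memory dirtied during this round, capped at total guest memory.
        remaining = min(float(memory_mb), dirty_rate_mb_per_s * transfer_time)
        if remaining <= threshold_mb:
            return round_no
    return None


def postcopy_transfer_time(memory_mb, bandwidth_mb_per_s):
    """Toy model of post copy: each page is sent at most once after the
    switch to the destination, so time grows linearly with memory size."""
    return memory_mb / bandwidth_mb_per_s
```

In this model a guest that dirties memory faster than the link can carry it never converges under pre-copy, whereas the post copy time depends only on memory size and bandwidth.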
>>
>>
>>
>> No conclusions were reached in the session, so the debate will
>> continue on the specs. The first one below is the main spec for the feature.
>>
>>
>>
>> https://review.openstack.org/#/c/301509 : Adds post-copy live migration
>> support to Nova
>>
>> https://review.openstack.org/#/c/305425 : Define instance availability
>> profiles
>>
>> https://review.openstack.org/#/c/306561 : Automatic Live Migration
>> Completion
>>
>>
>>
>> Live Migration orchestrated via conductor
>>
>>
>>
>> The proposal to move orchestration of live migration to conductor was
>> discussed in the working session on Friday, presented by Andrew Laski on
>> behalf of Timofey Durakov. This one prompted a lot of debate both for
>> and against the general idea, but there was no support for the patches
>> that have been submitted along with the spec so far. The general feeling
>> was that we need to attack this, but should take some simple cleanup
>> steps first to get a better idea of the problem. Dan Smith proposed moving
>> the stateless pre-migration steps to a sequence of calls from conductor (as
>> opposed to going back and forth between computes) as the first step.
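
The shape of that first step might look roughly like the sketch below, in which the conductor drives each stateless check itself, one after another, so the computes never call each other. Everything here is hypothetical: `FakeComputeClient`, the step names `check_destination` and `check_source`, and their order are placeholders for illustration, not the Nova RPC API or the content of the patches under review.

```python
class FakeComputeClient:
    """Stand-in for an RPC client to one compute host (hypothetical)."""

    def __init__(self, host, call_log):
        self.host = host
        self.call_log = call_log

    def call(self, step, instance_id):
        # Record which host handled which step; always succeeds here.
        self.call_log.append((self.host, step))
        return True


def run_pre_migration_checks(instance_id, source, dest):
    """Conductor-side loop: call each stateless step in sequence."""
    steps = [
        (dest, "check_destination"),  # hypothetical step name
        (source, "check_source"),     # hypothetical step name
    ]
    for client, step in steps:
        if not client.call(step, instance_id):
            raise RuntimeError(f"{step} failed on {client.host}")


log = []
src = FakeComputeClient("src", log)
dst = FakeComputeClient("dst", log)
run_pre_migration_checks("instance-1", src, dst)
```

Because every call originates from the conductor, the call log shows the full sequence in one place, which is the property the cleanup is after.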
>>
>>
>>
>> https://review.openstack.org/#/c/292271 : Remove compute-compute
>> communication in live-migration
>>
>>
>>
>> Cold and Live Migration Scheduling
>>
>>
>>
>> When this patch merges all migrations will use the request spec for
>> scheduling: https://review.openstack.org/#/c/284974
>>
>> Work is still ongoing for check destinations (allowing the scheduler to
>> check a destination chosen by the admin). When that is complete
>> migrations will have three ways to be placed:
>>
>> 1. Destination chosen by scheduler
>>
>> 2. Destination chosen by admin but checked by scheduler
>>
>> 3. Destination forced by admin
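
The three placement modes above can be sketched as a single decision function. This is only an illustration under assumptions: `select_destination`, the `force` flag, and the idea that the scheduler hands back a simple list of valid hosts are invented for the example and do not reflect the actual Nova API.

```python
from enum import Enum


class Placement(Enum):
    SCHEDULER_CHOSEN = "destination chosen by scheduler"
    ADMIN_CHECKED = "destination chosen by admin, checked by scheduler"
    ADMIN_FORCED = "destination forced by admin"


def select_destination(valid_hosts, requested_host=None, force=False):
    """Pick a destination host and report which placement mode applied.

    valid_hosts: hosts the (hypothetical) scheduler considers acceptable.
    requested_host: host named by the admin, if any.
    force: when True, skip the scheduler check for the requested host.
    """
    if requested_host is None:
        if not valid_hosts:
            raise ValueError("no valid host found")
        return valid_hosts[0], Placement.SCHEDULER_CHOSEN
    if force:
        return requested_host, Placement.ADMIN_FORCED
    if requested_host not in valid_hosts:
        raise ValueError(f"{requested_host} rejected by scheduler")
    return requested_host, Placement.ADMIN_CHECKED
```

For example, `select_destination(["host1", "host2"], requested_host="host2")` goes through the checked path, while adding `force=True` bypasses the check entirely.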
>>
>>
>>
>> https://review.openstack.org/#/c/296408 : Re-Proposes to check destination
>> on migrations
>>
>>
>>
>> PCI + NUMA claims
>>
>>
>>
>> Moshe and Jay are making great progress refactoring Nicola’s patches to
>> fix PCI and NUMA handling in migrations. The patch series should be
>> completed soon.
>
> The patch series for that is here (dependent on some cleanups from Jay and the top patch needs to be rebased):
>
> https://review.openstack.org/#/c/307124/
>
> It would be great if we could test this with some NFV CI but from the notes in the session it sounds like we need a multi-node job for this?
>
There were also comments in the Cinder-Nova session requesting that live migration tests be used to test Cinder back ends in external CI. We need to make sure we have something suitable.
>>
>>
>>
>>
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> Thanks for the great write-up Paul, you've saved me some time. :) And thanks to the whole sub-team working on this for keeping up the focus.
>
> --
>
> Thanks,
>
> Matt Riedemann
>
>