[openstack-dev] [nova] Migration progress

John Garbutt john at johngarbutt.com
Mon Nov 23 11:02:50 UTC 2015


On 23 November 2015 at 08:36, Paul Carlton <paul.carlton2 at hpe.com> wrote:
> John
>
> At the live migration sub team meeting I undertook to look at the issue
> of progress reporting.
>
> The use cases I'm envisaging are...
>
> As a user I want to know how much longer my instance will be migrating
> for.
>
> As an operator I want to identify any migrations that are making slow
> progress so I can expedite their progress or abort them.

+1

Agreed with this need.

Proposals to add pause and cancel clearly make this need more acute.

> The current implementation reports on the instance's migration with
> respect to memory transfer, using the total memory and memory remaining
> fields from libvirt to report the percentage of memory still to be
> transferred.  Due to the instance writing to pages already transferred
> this percentage can go up as well as down.  Daniel has done a good job
> of generating regular log records to report progress and highlight lack
> of progress but from the API all a user/operator can see is the current
> percentage complete.  By observing this periodically they can identify
> instance migrations that are struggling to migrate memory pages fast
> enough to keep pace with the instance's memory updates.
>
> The problem is that at present we have only one field, the instance
> progress, to record progress.  With a live migration there are two
> measures of progress: how much of the ephemeral disks (not needed for
> shared disk setups) has been copied and how much of the memory has been
> copied. Both can go up and down as the instance writes to pages already
> copied causing those pages to need to be copied again.  As Daniel says
> in his comments in the code, the disk size could dwarf the memory so
> reporting both in a single percentage number is problematic.
>
> We could add an additional progress item to the instance object, i.e.
> disk progress and memory progress but that seems odd to have an
> additional progress field only for this operation so this is probably
> a non starter!
>
> For operations staff with access to log files we could report disk
> progress as well as memory in the log file. However, that does not
> address the needs of users, and whilst log files are the right place for
> support staff to look when investigating issues, operational tooling
> is much better served by notification messages.
>
> Thus I'd recommend generating periodic notifications during a migration
> to report both memory and disk progress.  Cloud
> operators are likely to manage their instance migration activity using
> some orchestration tooling which could consume these notifications and
> deduce what challenges the instance migration is encountering and thus
> determine how to address any issues.

To be clear, our notifications are not designed to be consumed by end users.
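
As a rough illustration of the proposal quoted above, the sketch below shows
what an operator-facing progress notification might look like if memory and
disk were reported separately, and why a single blended percentage hides a
struggling memory copy. Everything here is hypothetical: the event type
string, the payload field names and the helper functions are invented for
illustration and are not existing nova notifications or APIs.

```python
from dataclasses import dataclass, asdict


@dataclass
class TransferProgress:
    """Bytes to copy and bytes still outstanding for one resource."""
    total_bytes: int
    remaining_bytes: int

    @property
    def percent_complete(self) -> int:
        if self.total_bytes <= 0:
            return 100  # nothing to copy, e.g. disks on shared storage
        done = self.total_bytes - self.remaining_bytes
        # Clamp to 0..100; remaining can grow as the guest dirties data.
        return max(0, min(100, done * 100 // self.total_bytes))


def build_progress_payload(instance_uuid, memory, disk):
    """Build a notification body with memory and disk kept separate."""
    def section(progress):
        return {**asdict(progress),
                "percent_complete": progress.percent_complete}

    return {
        # Hypothetical event type, not an existing nova notification.
        "event_type": "compute.instance.live_migration.progress",
        "payload": {
            "instance_uuid": instance_uuid,
            "memory": section(memory),
            "disk": section(disk),
        },
    }


def blended_percent(memory, disk):
    """Single byte-weighted percentage, shown only for comparison."""
    combined = TransferProgress(
        memory.total_bytes + disk.total_bytes,
        memory.remaining_bytes + disk.remaining_bytes,
    )
    return combined.percent_complete
```

With 8 GiB of memory half copied and a 200 GiB disk 95% copied, the blended
figure comes out at 93%, even though the memory copy, which decides whether
the migration can finish, is only at 50%. Keeping the two figures separate
lets orchestration tooling see that.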

> The use cases are only partially addressed by the current
> implementation: users can repeatedly get the server details and look at
> the progress percentage to see how quickly (or even if) it is
> increasing and determine how long the instance is likely to be
> migrating for.  However for an instance that has a large disk and/or
> is doing a high rate of disk i/o they may see the percentage complete
> (i.e. memory) repeatedly showing 90%+ but the instance migration does
> not complete.

Agreed, reporting progress, particularly with live-migrate, is awful right now.
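
To make the difficulty concrete, here is a minimal sketch of what a client
polling the server's progress field has to do today to guess a completion
time. The sample format and the function name are assumptions made for
illustration; nothing here is a nova API.

```python
from typing import List, Optional, Tuple


def estimate_seconds_remaining(
    samples: List[Tuple[float, int]],
) -> Optional[float]:
    """Estimate time to completion from polled progress samples.

    samples holds (timestamp_seconds, progress_percent) pairs, oldest
    first. Returns None when no estimate is possible, including when
    progress has stalled or gone backwards over the sampled window.
    """
    if len(samples) < 2:
        return None
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    elapsed = t1 - t0
    gained = p1 - p0
    if elapsed <= 0 or gained <= 0:
        # The guest is dirtying memory as fast as it is copied, so a
        # linear extrapolation would be meaningless.
        return None
    rate = gained / elapsed  # percentage points per second
    return (100 - p1) / rate
```

A migration that hovers around 90% yields None on every poll, which is
exactly the case where the user most wants an answer.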

Long term, I have my eye on this work:
https://etherpad.openstack.org/p/liberty-cross-project-user-notifications

But we should work on getting a good conceptual model for the progress
that can be exposed using the above system.

> The nova spec https://review.openstack.org/#/c/248472/ suggests making
> detailed information available via the os-migrations object.  This is
> not a bad idea but I have some issues with the implementation that I
> will share on that spec.

We do also need something that works across all hypervisor types.

Let's talk more on that spec review.

Thanks,
johnthetubaguy


