[openstack-dev] [nova] Migration progress

Paul Carlton paul.carlton2 at hpe.com
Mon Nov 23 08:36:32 UTC 2015


John

At the live migration sub team meeting I undertook to look at the issue
of progress reporting.

The use cases I'm envisaging are...

As a user I want to know how much longer my instance will be migrating
for.

As an operator I want to identify any migrations that are making slow
progress so I can expedite or abort them.

The current implementation reports on the instance's migration with
respect to memory transfer, using the total memory and memory remaining
fields from libvirt to report the percentage of memory still to be
transferred.  Due to the instance writing to pages already transferred
this percentage can go up as well as down.  Daniel has done a good job
of generating regular log records to report progress and highlight lack
of progress but from the API all a user/operator can see is the current
percentage complete.  By observing this periodically they can identify
instance migrations that are struggling to migrate memory pages fast
enough to keep pace with the instance's memory updates.
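
To make the calculation concrete, here is a minimal sketch of how a
percentage could be derived from the two fields mentioned above.  The
function name and the sample values are my own illustrative assumptions,
not the code that is in nova today.

```python
def memory_progress_percent(memory_total, memory_remaining):
    """Return the percentage of memory already transferred (0-100).

    Hypothetical helper: the arguments stand in for the total memory
    and memory remaining values reported by libvirt.
    """
    if memory_total <= 0:
        return 0
    # Defensive clamp (an assumption) so the result stays within 0-100.
    remaining = min(max(memory_remaining, 0), memory_total)
    return int(100 * (memory_total - remaining) / memory_total)


# Made-up successive samples of (total, remaining).  The third sample has
# more memory remaining than the second because the guest dirtied pages
# that had already been sent.
samples = [(4096, 2048), (4096, 512), (4096, 1024)]
progress = [memory_progress_percent(t, r) for t, r in samples]
print(progress)  # [50, 87, 75]
```

The last value dropping from 87 to 75 is the "goes backwards" behaviour
described above.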

The problem is that at present we have only one field, the instance
progress, to record progress.  With a live migration there are two
measures of progress: how much of the ephemeral disks has been copied
(not needed for shared disk setups) and how much of the memory has
been copied.  Both can go up and down, as the instance writes to pages
already copied, causing those pages to need to be copied again.  As
Daniel says in his comments in the code, the disk size could dwarf the
memory, so reporting both in a single percentage number is problematic.
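
A quick sketch of why a single number is misleading, assuming the two
measures were simply combined by bytes.  The function and the sizes are
hypothetical and only meant to illustrate the point.

```python
def combined_progress_percent(disk_total, disk_remaining,
                              memory_total, memory_remaining):
    """Byte-weighted percentage across disk and memory (hypothetical)."""
    total = disk_total + memory_total
    if total <= 0:
        return 0
    remaining = disk_remaining + memory_remaining
    return int(100 * (total - remaining) / total)


GiB = 1024 ** 3

# Assumed instance: 100 GiB of ephemeral disk fully copied, 4 GiB of
# memory with nothing copied yet.
combined = combined_progress_percent(100 * GiB, 0, 4 * GiB, 4 * GiB)
print(combined)  # 96
```

The combined figure reads 96% even though none of the memory has been
transferred, so the disk completely hides the state of the memory copy.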

We could add additional progress items to the instance object, i.e.
disk progress and memory progress, but it seems odd to have
additional progress fields only for this operation, so this is probably
a non-starter!

For operations staff with access to log files we could report disk
progress as well as memory in the log file.  However, that does not
address the needs of users and, whilst log files are the right place
for support staff to look when investigating issues, operational
tooling is much better served by notification messages.

Thus I'd suggest that generating periodic notifications during a
migration to report both memory and disk progress would be useful.  Cloud
operators are likely to manage their instance migration activity using
some orchestration tooling which could consume these notifications and
deduce what challenges the instance migration is encountering and thus
determine how to address any issues.
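
As a rough illustration of what such a notification might carry, here
is a sketch of a payload builder.  The event type string and every
field name are hypothetical; nothing like this exists in nova at
present, and how it would be emitted is left open.

```python
import json

# Hypothetical event type, for illustration only.
EVENT_TYPE = "compute.instance.live_migration.progress"


def build_progress_payload(instance_uuid, memory_total, memory_remaining,
                           disk_total, disk_remaining):
    """Build a notification payload that reports memory and disk
    progress separately (all field names are assumptions)."""
    return {
        "event_type": EVENT_TYPE,
        "instance_uuid": instance_uuid,
        "memory_total_bytes": memory_total,
        "memory_remaining_bytes": memory_remaining,
        "disk_total_bytes": disk_total,
        "disk_remaining_bytes": disk_remaining,
    }


payload = build_progress_payload("hypothetical-uuid", 4096, 1024,
                                 100000, 25000)
message = json.dumps(payload, sort_keys=True)
```

Reporting raw totals and remaining amounts, rather than a precomputed
percentage, would let the consuming tooling work out rates and trends
for memory and disk independently.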

The use cases are only partially addressed by the current
implementation.  Users and operators can repeatedly get the server
details and look at the progress percentage to see how quickly (or
even if) it is increasing and estimate how long the instance is likely
to be migrating for.  However, for an instance that has a large disk
and/or is doing a high rate of disk I/O, they may see the percentage
complete (i.e. memory) repeatedly showing 90%+ while the instance
migration does not complete.
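
The polling approach amounts to something like the sketch below, which
takes progress samples a caller has already collected and extrapolates
linearly.  The function name and the sample data are assumptions made
for illustration.

```python
def estimate_seconds_remaining(samples):
    """Estimate the seconds left until 100%, by linear extrapolation.

    samples is a time-ordered list of (timestamp_seconds, percent)
    pairs.  Returns None when there are too few samples or when the
    percentage has not increased between the first and last sample.
    """
    if len(samples) < 2:
        return None
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    elapsed = t1 - t0
    gained = p1 - p0
    if elapsed <= 0 or gained <= 0:
        return None
    rate = gained / elapsed  # percent per second
    return (100 - p1) / rate


steady = estimate_seconds_remaining([(0, 10), (30, 25), (60, 40)])
stalled = estimate_seconds_remaining([(0, 92), (30, 95), (60, 91)])
print(steady, stalled)  # 120.0 None
```

The second series is the struggling case: the percentage hovers above
90% without advancing, so no estimate can be made from it.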

The nova spec https://review.openstack.org/#/c/248472/ suggests making
detailed information available via the os-migrations object.  This is
not a bad idea but I have some issues with the implementation that I
will share on that spec.

-- Paul Carlton Software Engineer Cloud Services
Hewlett Packard Enterprise


