[Openstack-operators] Need feedback for nova aborting cold migration function

sagaray at nttdata.co.jp sagaray at nttdata.co.jp
Thu May 10 02:33:16 UTC 2018

Hi Takashi, and guys,

We operate a large telco enterprise cloud.

We always perform maintenance work at midnight, within a limited time slot, to minimize the impact on our users.

Planning cold migrations is difficult because the migration time varies drastically; among other things, it depends on the load on the storage servers at that moment. If a cold migration stalls for an unknown reason, operators may decide to cancel it manually. Recovering from that situation takes several manual steps: killing the copy process, then running reset-state, stop, and start on the VM. If we could cancel a cold migration, we could resume service safely even when the migration does not complete within the maintenance window.
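
For reference, the manual recovery steps above could be scripted roughly as follows. This is only a sketch, not our actual procedure: it assumes the `openstack` CLI is installed with admin credentials loaded, that "active" is the right state to reset to, and the function names `recovery_commands` and `recover` are made up for illustration. Killing the stalled copy process is left out because it has to be done on the compute host and depends on the deployment.

```python
import subprocess
from typing import List


def recovery_commands(server: str) -> List[List[str]]:
    """Build the CLI calls for the reset-state, stop, start sequence.

    Killing the stalled copy process is not included: it has to be done
    on the compute host and depends on the deployment.
    """
    return [
        # Assumption: resetting to "active" is the right target state.
        ["openstack", "server", "set", "--state", "active", server],
        ["openstack", "server", "stop", server],
        ["openstack", "server", "start", server],
    ]


def recover(server: str, dry_run: bool = True) -> List[str]:
    """Run (or, with dry_run, only print) each step; return the command lines."""
    lines = []
    for cmd in recovery_commands(server):
        line = " ".join(cmd)
        lines.append(line)
        print(line)
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)
    return lines


if __name__ == "__main__":
    recover("my-instance")  # dry run: prints the three commands only
```

Stop and start are asynchronous in Nova, so a real tool would also have to wait for each state transition before issuing the next command, which is part of why we would rather not maintain this ourselves.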

Today we work around this with a manual procedure for recovering instances from a cold migration failure, but we have to follow those steps every time. We could build our own tool to automate the process, but we would have to maintain it ourselves because no OpenStack distribution supports this feature.

If Nova supported cancelling a cold migration, it would definitely help us bring instances back after a cold migration failure and so improve service availability for our end users. We would also no longer need to maintain a procedure manual or a proprietary tool ourselves, which would be a huge win for us.

We are definitely interested in this function and would love to see it in the next release.

Thank you for your hard work.

Yukinori Sagara <sagaray at nttdata.co.jp>
Platform Engineering Department, NTT DATA Corp.

> Hi everyone,
> I'm going to add the aborting cold migration function [1] in nova.
> I would like to ask operators' feedback on this.
> The cold migration is an administrator operation by default.
> If administrators perform cold migration and it is stalled out,
> users cannot do their operations (e.g. starting the VM).
> In that case, if administrators can abort the cold migration by using
> this function,
> it enables users to operate their VMs.
> If you are a person like the following, would you reply to this mail?
> * Those who need this function
> * Those who will use this function if it is implemented
> * Those who think that it is better to have this function
> * Those who are interested in this function
> [1] https://review.openstack.org/#/c/334732/
> Regards,
> Takashi Natsume
> NTT Software Innovation Center
> E-mail: natsume.takashi at lab.ntt.co.jp
