[openstack-dev] [Nova] Admin API to fix failed migrations
kaushik.chandrashekar at rackspace.com
Thu Jun 6 08:56:39 UTC 2013
We are trying to find a solution for stuck or errored resizes in nova.
Admins should be able to fix a resize regardless of which state or step of
the process the instance is in.
A resize can be reverted or confirmed only after it has finished. Whether
that is allowed depends on the instance's vm_state and task_state and on
the migration status.
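
To make that condition concrete, here is a minimal sketch of the kind of check involved. It is not nova's actual code. The Instance and Migration classes and the function name are hypothetical, and the specific values ("resized", no task_state, "finished") are assumptions about how a finished resize is represented.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values for a resize that has finished and awaits confirm/revert.
RESIZED = "resized"
FINISHED = "finished"


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record."""
    vm_state: str
    task_state: Optional[str]


@dataclass
class Migration:
    """Hypothetical stand-in for a migration record."""
    status: str


def can_confirm_or_revert(instance: Instance, migration: Migration) -> bool:
    """Return True only when all three fields describe a finished resize."""
    return (
        instance.vm_state == RESIZED
        and instance.task_state is None
        and migration.status == FINISHED
    )
```

A resize whose rsync died midway would typically fail this check on at least one of the three fields, which is why it can be neither confirmed nor reverted.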
There are some scenarios in which the migration gets stuck or goes into an
error state, for instance when the rsync process dies midway because of a
deployment, or when the compute node restarts. In such cases, admins can try
to recover the migration and confirm it, or revert the migration to get the
original instance up and running.
The admins are currently forced to update the db and mark the migration as
finished so that they can either confirm or revert the resize. They spend a
lot of time getting the nova db into the desired state.
Possible solutions:
1. Add a new API that sets the vm_state, task_state and migration status
such that the migration can be reverted or confirmed. Alternatively, we can
extend an existing API like reset-state to take a flag like
"--error-with-failed-migration". This avoids the risk of exposing a liberal
API that would allow admins to update db fields with free-form values.
Instead, it is deliberately restrictive and sets only the values needed to
mark the migration as finished.
2. Allow admins to revert or confirm a migration regardless of the
instance and migration states.
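
As a rough illustration of option 1, the sketch below shows what "restrictive" could mean in practice: the caller cannot pass any state values, only ask for the one fixed transition. Everything here is hypothetical (the record classes, the function name, the is_admin argument), and the target values are assumptions about how a finished resize is represented, not nova's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record."""
    vm_state: str
    task_state: Optional[str]


@dataclass
class Migration:
    """Hypothetical stand-in for a migration record."""
    status: str


def mark_migration_finished(instance: Instance, migration: Migration,
                            is_admin: bool) -> None:
    """Force the records into the assumed 'finished resize' state.

    No free-form values are accepted: the target values are hard-coded,
    so an admin cannot write arbitrary data into these fields.
    """
    if not is_admin:
        raise PermissionError("only admins may reset a failed migration")
    instance.vm_state = "resized"   # assumed value
    instance.task_state = None      # assumed: no task in progress
    migration.status = "finished"   # assumed value


# Example: a resize that errored midway is made confirmable/revertable again.
inst = Instance(vm_state="error", task_state="resize_migrating")
mig = Migration(status="error")
mark_migration_finished(inst, mig, is_admin=True)
```

Option 2 would instead relax the checks in the confirm and revert paths themselves, so no such state-fixing call would be needed.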
Let us know your thoughts/suggestions on this.