[openstack-dev] [Nova] Migration state machine proposal.

Andrew Laski andrew at lascii.com
Wed Oct 28 16:53:18 UTC 2015

On 10/27/15 at 09:02am, Jay Pipes wrote:
>On 10/22/2015 11:13 AM, Tang Chen wrote:
>>On 10/22/2015 05:17 AM, Joshua Harlow wrote:
>>>Overall I'm very much inclined to have three state machines (one
>>>for each type), vs the mish-mash of all three into one state machine
>>>(which causes the confusion around states in the first diagram in
>>>that paste).
>>That is an idea. But I would prefer to have one single state machine
>>for migration, because resize and evacuate are reusing migration.
>>They can be in one state machine.
>Evacuate does *not* migrate/move anything. Evacuate *rebuilds* VMs 
>from their original source image.
>I support Nikola in that I believe the different migration types 
>should have different state machines entirely (but be as consistent 
>as possible in the naming of terminal states like "finished" vs 
>"done" etc)


It seems that there's also some work to do to disambiguate what the 
different operations are and what they actually do.

Migrate/resize share a code path, and then there's a big switch to 
change behavior if the migration is a live-migrate.  I agree with the 
comment below that we should deprecate the resize terminology and 
consolidate cold-migrate and resize under one umbrella with one state 
machine.  Then live-migrate could have a separate state machine.
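
As a rough sketch of that split (not Nova code), the two machines could be
modelled as separate transition tables, with the target flavor carried as an
attribute of a cold migration rather than as a separate "resize" operation.
All state names here are hypothetical apart from "finished", which is borrowed
from the naming discussion above, and the Migration class is invented for
illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional

# Hypothetical transition tables: each state maps to the states it may move to.
# Terminal states map to an empty set.
COLD_MIGRATE: Dict[str, FrozenSet[str]] = {
    "queued": frozenset({"migrating", "error"}),
    "migrating": frozenset({"finished", "error"}),
    "finished": frozenset(),
    "error": frozenset(),
}

LIVE_MIGRATE: Dict[str, FrozenSet[str]] = {
    "queued": frozenset({"preparing", "error"}),
    "preparing": frozenset({"running", "error"}),
    "running": frozenset({"finished", "error"}),
    "finished": frozenset(),
    "error": frozenset(),
}


@dataclass
class Migration:
    """One migration driven by exactly one of the tables above."""
    transitions: Dict[str, FrozenSet[str]]
    state: str = "queued"
    # Assumption: a "resize" is a cold migration that also carries a new flavor.
    new_flavor: Optional[str] = None

    def advance(self, new_state: str) -> None:
        """Move to new_state, rejecting anything the table does not allow."""
        if new_state not in self.transitions[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

    @property
    def is_terminal(self) -> bool:
        return not self.transitions[self.state]


# What is called "resize" today: a cold migration with a target flavor.
resize = Migration(COLD_MIGRATE, new_flavor="m1.large")
resize.advance("migrating")
resize.advance("finished")
```

Keeping the tables separate means a live-migrate-only state such as the
hypothetical "preparing" never shows up in the cold-migrate graph.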

Rebuild/evacuate share a code path, though evacuate involves a scheduler 
decision, which is why it seems like a move.  It's actually a bit tricky 
to classify: if the instance is volume backed it is essentially just a 
move operation, but if it's image backed it's a destructive rebuild.  
Still, I think it makes sense not to consider this a move operation and 
to think of it as an administrative rebuild when a host is down.
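
To make the distinction concrete, here is a minimal sketch of that
classification.  The Instance class and the evacuate_effect function are
hypothetical names for illustration and are not part of Nova.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    uuid: str
    volume_backed: bool


def evacuate_effect(instance: Instance) -> str:
    """Describe what an evacuate effectively does to the root disk."""
    if instance.volume_backed:
        # The root disk lives on a volume, so the data follows the instance.
        return "move"
    # The root disk is recreated from the image, so local data is lost.
    return "destructive rebuild"
```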

>>It would be very helpful if the designer of the migration process
>>could share his idea. But if it is just some code modified by many
>>people many times, I think we should remove the confusing states and
>>give an easier, better state machine.
>There isn't a designer of the migration process :( The original 
>(crap, IMHO) API from Rackspace Cloud Servers API was used for the 
>resize functionality in the compute API and it's been a source of 
>confusion and frustration ever since. Relying on a manual 
>confirmation or revert input from the user was and continues to be a 
>horrible idea.

Agreed.  In my experience with operating a public cloud I am not aware 
of anyone benefiting from the manual checkpoint in the middle of resize.  
But we should solicit feedback from the operator community on this.

>I believe strongly that we should deprecate the existing migrate, 
>resize, and live-migrate APIs in favor of a single consolidated, 
>consistent "move" REST API that would have the following 

I'm not sure that we should abstract away live vs cold migrate behind a 
single move API, but I strongly agree with consolidating cold-migrate and 

>* No manual or wait-input states in any FSM graph
>* Removal of the term "resize" from the API entirely (the target 
>resource sizing is an attribute of the move operation, not a 
>different type of API operation in and of itself)
>* Transition to a task-based API for poll-state requests. This means 
>that in order for a caller to determine the state of a VM the caller 
>would call something like GET /servers/<UUID>/tasks/<UUID> in order 
>to see the history of state changes or subtask operations for a 
>particular request to move a VM

Huge +1 from me.
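
For illustration only, a task record behind such an endpoint might look
something like the sketch below.  The field names, the set of terminal states
and the helper names are all assumptions; only the URL shape comes from the
proposal above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: every task ends in one of these states, with no manual
# confirm/revert step in between.
TERMINAL_STATES = frozenset({"finished", "error"})


@dataclass
class StateChange:
    state: str
    timestamp: str  # e.g. an ISO 8601 string supplied by the caller


@dataclass
class MoveTask:
    """History of one request to move a server."""
    server_id: str
    task_id: str
    history: List[StateChange] = field(default_factory=list)

    def record(self, state: str, timestamp: str) -> None:
        """Append a state change; nothing may follow a terminal state."""
        if self.done:
            raise ValueError("task already reached a terminal state")
        self.history.append(StateChange(state, timestamp))

    @property
    def current_state(self) -> Optional[str]:
        return self.history[-1].state if self.history else None

    @property
    def done(self) -> bool:
        return self.current_state in TERMINAL_STATES


def task_url(server_id: str, task_id: str) -> str:
    """Build the path a caller would poll for this task."""
    return f"/servers/{server_id}/tasks/{task_id}"
```

A caller would poll task_url(...) until the returned task reports done, and
could read the full history list to see every intermediate state.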

>Timofei Durakov (cc'd) has a blueprint for splitting the 
>live-migration types into separate task classes here:
>I think there's a lot of good ideas in that proposal. Please do have 
>a look at it.