[openstack-dev] [Nova] Migration state machine proposal.
Tang Chen
tangchen at cn.fujitsu.com
Thu Oct 29 01:46:53 UTC 2015
On 10/29/2015 09:26 AM, Tang Chen wrote:
> Hi Andrew,
>
> On 10/29/2015 12:53 AM, Andrew Laski wrote:
>> On 10/27/15 at 09:02am, Jay Pipes wrote:
>>> On 10/22/2015 11:13 AM, Tang Chen wrote:
>>>> On 10/22/2015 05:17 AM, Joshua Harlow wrote:
>>>>> Overall I'm very much inclined to have three state machines (one
>>>>> for each type), vs the mishmash of all three into one state machine
>>>>> (which causes the confusion around states in the first diagram in
>>>>> that paste).
>>>>
>>>> That is an idea. But I would prefer to have one single state machine
>>>> for migration, because resize and evacuate are reusing migration.
>>>> They can be in one state machine.
>>>
>>> Evacuate does *not* migrate/move anything. Evacuate *rebuilds* VMs
>>> from their original source image.
>>>
>>> I support Nikola in that I believe the different migration types
>>> should have different state machines entirely (but be as consistent
>>> as possible in the naming of terminal states like "finished" vs
>>> "done" etc)
>>
>> +1.
>>
>> It seems that there's also some work to do to disambiguate what the
>> different operations are and what they actually do.
>>
>> Migrate/resize share a code path, and then there's a big switch to
>> change behavior if the migration is a live-migrate. I agree with the
>> comment below that we should deprecate the resize terminology and
>> consolidate cold-migrate and resize under one umbrella with one state
>> machine. Then live-migrate could have a separate state machine.
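
To make the split described above concrete, here is a minimal sketch in
Python. It is not Nova code: the MigrationFSM class, the state names and the
transition tables are all hypothetical, chosen only to show one machine for
cold-migrate/resize, a separate one for live-migrate, and shared terminal
state names ("finished", "error") across both.

```python
class InvalidTransition(Exception):
    """Raised when a requested state change is not in the transition table."""


class MigrationFSM:
    """Tiny table-driven state machine (hypothetical, not a Nova class)."""

    # Terminal state names are shared by every machine for consistency.
    TERMINAL = frozenset({"finished", "error"})

    def __init__(self, name, initial, transitions):
        self.name = name
        self.state = initial
        # Mapping of state -> set of states reachable from it.
        self.transitions = transitions

    def advance(self, new_state):
        """Move to new_state, or raise if the table does not allow it."""
        allowed = self.transitions.get(self.state, set())
        if new_state not in allowed:
            raise InvalidTransition(
                "%s: %s -> %s is not allowed"
                % (self.name, self.state, new_state))
        self.state = new_state

    @property
    def done(self):
        return self.state in self.TERMINAL


def cold_move_fsm():
    """Cold-migrate and resize under one machine (illustrative states)."""
    return MigrationFSM("cold-move", "accepted", {
        "accepted": {"pre-migrating", "error"},
        "pre-migrating": {"migrating", "error"},
        "migrating": {"post-migrating", "error"},
        "post-migrating": {"finished", "error"},
    })


def live_migrate_fsm():
    """Live-migrate gets its own table (illustrative states)."""
    return MigrationFSM("live-migrate", "accepted", {
        "accepted": {"preparing", "error"},
        "preparing": {"running", "error"},
        "running": {"post-migrating", "error"},
        "post-migrating": {"finished", "error"},
    })
```

With separate tables, a state that only makes sense for one migration type
never shows up in the other type's graph, while a caller can still test
"done" the same way on both.
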
>>
>> Rebuild/evacuate share a code path, though evacuate involves a
>> scheduler decision, which is why it seems like a move. It's actually
>> a bit tricky to classify: if the instance is volume-backed it is
>> essentially just a move operation, but if it's image-backed it's a
>> destructive rebuild. Still, I think it makes sense not to consider
>> this a move operation and to think of it as an administrative rebuild
>> when a host is down.
>
> I tend to think that we should define each migration type clearly
> first, and then improve the migration state machine. If we don't agree
> on what operation each type represents, the state machine won't be good.
>
> Please also help review these BPs, although the ideas may not be
> exactly the same as yours.
https://blueprints.launchpad.net/nova/+spec/migration-type-refactor
https://blueprints.launchpad.net/nova/+spec/migration-state-field-machine
>
> Thanks.
>
>>
>>>
>>>> It would be very helpful if the designer of the migration process
>>>> could share his idea. But if it is just some code modified by many
>>>> people many times, I think we should remove the confusing states and
>>>> give an easier, better state machine.
>>>
>>> There isn't a designer of the migration process :( The original
>>> (crap, IMHO) API from Rackspace Cloud Servers API was used for the
>>> resize functionality in the compute API and it's been a source of
>>> confusion and frustration ever since. Relying on a manual
>>> confirmation or revert input from the user was and continues to be a
>>> horrible idea.
>>
>> Agreed. In my experience with operating a public cloud I am not
>> aware of anyone benefiting from the manual checkpoint in the middle
>> of resize. But we should solicit feedback from the operator
>> community on this.
>>
>>>
>>> I believe strongly that we should deprecate the existing migrate,
>>> resize, and live-migrate APIs in favor of a single consolidated,
>>> consistent "move" REST API that would have the following
>>> characteristics:
>>
>> I'm not sure that we should abstract away live vs cold migrate behind
>> a single move API, but I strongly agree with consolidating cold-migrate
>> and resize.
>>
>>>
>>> * No manual or wait-input states in any FSM graph
>>> * Removal of the term "resize" from the API entirely (the target
>>> resource sizing is an attribute of the move operation, not a
>>> different type of API operation in and of itself)
>>> * Transition to a task-based API for poll-state requests. This means
>>> that in order for a caller to determine the state of a VM the caller
>>> would call something like GET /servers/<UUID>/tasks/<UUID> in order
>>> to see the history of state changes or subtask operations for a
>>> particular request to move a VM
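
The task-based polling in the last bullet could look roughly like this from
the caller's side. This is only a sketch of a proposed API, not something
Nova exposes: the response shape (a dict with a "state" key), the terminal
state names, and the helper names task_url and poll_task are assumptions. The
HTTP call is passed in as a "fetch" callable so the sketch does not depend on
any particular client library.

```python
import time

# Assumed terminal state names; the proposal does not fix them.
TERMINAL_STATES = frozenset({"finished", "error"})


def task_url(base_url, server_id, task_id):
    """Build the GET /servers/<UUID>/tasks/<UUID> URL from the proposal."""
    return "%s/servers/%s/tasks/%s" % (base_url, server_id, task_id)


def poll_task(fetch, url, interval=2.0, timeout=600.0,
              sleep=time.sleep, clock=time.monotonic):
    """Poll a task resource until it reaches a terminal state.

    fetch(url) must return the decoded task document as a dict that has
    a "state" key (an assumed response shape).
    """
    deadline = clock() + timeout
    while True:
        task = fetch(url)
        if task["state"] in TERMINAL_STATES:
            return task
        if clock() >= deadline:
            raise TimeoutError("task at %s did not finish in time" % url)
        sleep(interval)
```

The point of the design is that the caller only ever asks about one task, so
no FSM needs a state that waits for user input.
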
>>
>> Huge +1 from me.
>>
>>>
>>> Timofei Durakov (cc'd) has a blueprint for splitting the
>>> live-migration types into separate task classes here:
>>>
>>> https://review.openstack.org/#/c/225910/
>>>
>>> I think there's a lot of good ideas in that proposal. Please do have
>>> a look at it.
>>>
>>> Best,
>>> -jay
>>>
>>> __________________________________________________________________________
>>>
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe:
>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>
>
>