Open Stack

Fri Jan 18 15:56:45 UTC 2013

Hi Mark,

I think it's a mistake to try and build too much workflow on top of task_state.   Based on a conversation with Yun some time back I thought the approach now was:

If task_state is not null and vm_state is not error it shows that either:
i) some action is in progress
ii) some action has failed and the task_state hasn't been reset.

In the case of i), aside from delete (as not allowing this might have financial impact on the customer) I don't really see a reason for allowing any new operation to start. All you risk doing is building up a queue of operations for the instance. Since the current action won't be stopped by a new action, it's not as if any new action can be used as a cancel.
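
To make the rule concrete, here is a minimal sketch of the check. The function name and the string values are made up for illustration and are not Nova's actual API code; the vm_state == error case is deliberately left out.

```python
from typing import Optional


def operation_allowed(operation: str, task_state: Optional[str]) -> bool:
    """Hypothetical policy check: may a new operation start on the instance?"""
    if task_state is None:
        # Nothing is in progress, so any operation may start.
        return True
    # Something is in progress (or failed without resetting task_state).
    # Only delete is let through, so the customer can always stop paying.
    return operation == "delete"
```

With this, a reboot requested while task_state is set is refused rather than queued behind the running action.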

In the case of ii) there are two sub cases:

ii.a) The thread on the host performing the action failed in some way without clearing the task_state. This would be a bug in the code that needs to be hunted down and fixed. If we need to give the user a graceful way to escape from such bugs in the meantime then maybe always allowing a reboot would be OK, but then I think every operation needs to take the instance lock to avoid weird side effects if something is still running.

ii.b) The compute manager is down. No point in allowing any new operation (other than delete) in this case, as it won't get actioned. It would be useful if the state of the service could be reflected back to the user, I guess, as additional data. However, on restart the compute manager should always take some action which will clear the task_state (as it knows in most cases that nothing is running). For example:

Scheduling:   			do nothing - the request is still in the queue
Block_device_mapping:	go to an error state that only allows delete - a build request was killed.  Can't recover this as some data (like user metadata) will have been lost
Networking:			go to error state that only allows delete - a build request was killed.  Can't recover this as some data (like user metadata) will have been lost
Spawning:			Check if the VM is running (it may have finished starting while the compute manager was down).  If it is running go to Active, otherwise go to Error

Image_*:			Clear the task state, it's not snapshotting anymore  (There isn't a way to show the user that this has failed as far as I can see)

Updating_password:		Clear the task state, it either finished or didn't - the user will have to redo the change if they need to

Resize_*:			Go to an Error state 

Rebooting*:			Re-launch the reboot (can recover / redo this)  

Pausing:			Check the power state, if it's not consistent with the task_state, re-launch the operation
Unpausing:
Suspending:
Resuming:
Stopping:
Starting:
Powering_Off:
Powering_On:

Rescuing:			Check the power state, if it's not running redo the rescue (might miss the case where the rescue never started)
UnRescuing:			Check the power state, if it's not running redo the unrescue (might miss the case where the unrescue never started)

Rebuilding*:			Go to an error state that only allows rebuild or delete

Migrating*:			<<< Not clear what to do here, do we always know the original state to go back to ?>>>

Deleting*:			Redo the deletion
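
As a rough sketch only, the list above could be expressed as a lookup that the compute manager runs over its instances on restart. The action labels, the rule table and the function name are all invented here for illustration; none of this is existing Nova code, and the Migrating* entry stays undecided as noted above.

```python
from fnmatch import fnmatchcase
from typing import Optional

# The power-state group from the list above shares one action.
POWER_TASKS = [
    "pausing", "unpausing", "suspending", "resuming",
    "stopping", "starting", "powering_off", "powering_on",
]

# Ordered (pattern, action) pairs; action labels are illustrative only.
RECOVERY_RULES = [
    ("scheduling", "do_nothing"),
    ("block_device_mapping", "error_delete_only"),
    ("networking", "error_delete_only"),
    ("spawning", "check_vm_running"),
    ("image_*", "clear_task_state"),
    ("updating_password", "clear_task_state"),
    ("resize_*", "error"),
    ("rebooting*", "redo"),
    *[(task, "check_power_state") for task in POWER_TASKS],
    ("rescuing", "redo_if_not_running"),
    ("unrescuing", "redo_if_not_running"),
    ("rebuilding*", "error_rebuild_or_delete"),
    ("migrating*", "undecided"),
    ("deleting*", "redo"),
]


def recovery_action(task_state: Optional[str]) -> Optional[str]:
    """Return the restart action label for a task_state, or None if no rule applies."""
    if task_state is None:
        return None
    name = task_state.lower()
    for pattern, action in RECOVERY_RULES:
        # fnmatchcase matches the whole string, so "rescuing" does not
        # accidentally match "unrescuing".
        if fnmatchcase(name, pattern):
            return action
    return None
```

For an instance left in REBOOTING_HARD, recovery_action returns the "redo" label, i.e. the reboot is re-launched.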

In your specific scenario the system would have completed the reboot without the user needing to re-submit it.

Does this make sense, or do I need to increase my dosage again ;-)

Phil

-----Original Message-----
From: Mark McLoughlin [mailto:markmc at redhat.com] 
Sent: 18 January 2013 14:10
To: openstack-dev at lists.openstack.org
Subject: [openstack-dev] Nova's use of task_state for reboot serialization

Hey

Here's a scenario I came across yesterday in a real Folsom deployment:

  https://bugs.launchpad.net/nova/+bug/982108

  - compute node locked up for over 12 due to what looks like a kernel 
    bug

  - during that time, someone came along and tried to reboot their 
    instance with horizon which does a hard reboot

  - the reboot message was cast to the compute node but never picked up 
    so the instance was in task_state=REBOOTING_HARD

  - once the compute node had come back to life, the user tried 
    rebooting again  but wasn't allowed. The instance needed admin 
    intervention to get unstuck.

At first, I thought this was just an oversight that REBOOTING_HARD wasn't one of the allowed states for rebooting.

However, I came across a discussion here:

  https://review.openstack.org/5090

which shows that we're using task_state to prevent multiple reboots of the same type happening at once. I'm assuming that's because the reboots would interfere with each other, e.g. attempting to create the same VM twice.

That has me wondering why we just don't take a lock on the instance in the compute manager during a reboot?

Isn't it the case that we should only be using task_state as a kind of "it doesn't make sense to do foo while bar is happening" type policies?
As opposed to task serialization?

Also, if a task is kicked off with a cast, we never know whether the message was actually received, so we don't know to revert the task_state in that case. If we want asynchrony in these cases, shouldn't we use call() but have the compute node spawn off a greenthread to carry out the action? That way the message is acknowledged and we know that task_state reversion should happen if it fails from that point on.

In summary, the patch I'd cook up for this would change the reboot
cast() to call() and have nova-compute spawn off a greenthread which takes the instance lock for the duration of the reboot and reverts task_state on failure.

Am I missing something?

Is this a fundamental change from our previous thinking and we should audit for similar problems, or is this just an individual oddity?

Cheers,
Mark.

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
