[openstack-dev] [nova] instances stuck with task_state of REBOOTING

Solly Ross sross at redhat.com
Fri Mar 21 14:41:06 UTC 2014


Well, if messages are getting dropped on the floor due to communication issues, that's not a good thing.
If you have time, could you determine why they are being dropped? We shouldn't be
doing things that require both the controller and compute nodes until we have a connection.

Best Regards,
Solly Ross

----- Original Message -----
From: "Chris Friesen" <chris.friesen at windriver.com>
To: openstack-dev at lists.openstack.org
Sent: Thursday, March 20, 2014 2:59:55 PM
Subject: Re: [openstack-dev] [nova] instances stuck with task_state of REBOOTING

On 03/20/2014 12:29 PM, Chris Friesen wrote:

> The fact that there are no success or error logs in nova-compute.log
> makes me wonder if we somehow got stuck in self.driver.reboot().
>
> Also, I'm kind of wondering what would happen if nova-compute was
> running reboot_instance() and we rebooted the controller at the same
> time.  reboot_instance() could time out trying to update the instance
> with the new power state and a task_state of None.  Later on in
> _sync_power_states() we would update the power_state, but nothing would
> update the task_state.  I don't think this is what happened to us though
> since I'd expect to see logs of the timeout.

Actually, looking at the logs a bit more carefully it appears that what 
happened is something like this:

We reboot the controllers.
Right after they come back up, something calls compute.api.API.reboot().
That sets instance.task_state = task_states.REBOOTING and then calls
instance.save() to update the database.
Then it calls self.compute_rpcapi.reboot_instance(), which does an RPC cast.
That message gets dropped on the floor due to communication issues
between the controller and the compute node.
Since a cast is one-way, nothing reports the loss, and we're stuck with a
task_state of REBOOTING.
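
As a rough illustration of that sequence, here is a toy model in Python. It is
not nova code: Instance, FakeDB, api_reboot() and the two cast functions are
hypothetical stand-ins I made up for the API call, the database save and the
RPC cast, and REBOOTING only mimics task_states.REBOOTING.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

REBOOTING = "rebooting"  # hypothetical stand-in for task_states.REBOOTING


@dataclass
class Instance:
    uuid: str
    task_state: Optional[str] = None


class FakeDB:
    """In-memory stand-in for the database (hypothetical)."""

    def __init__(self) -> None:
        self.rows: Dict[str, Optional[str]] = {}

    def save(self, instance: Instance) -> None:
        self.rows[instance.uuid] = instance.task_state


def compute_reboot_instance(db: FakeDB, instance: Instance) -> None:
    """Compute-side handler: after the reboot, clear the task_state."""
    instance.task_state = None
    db.save(instance)


def working_cast(db: FakeDB, instance: Instance) -> None:
    """The message reaches the compute node, which handles it."""
    compute_reboot_instance(db, instance)


def dropped_cast(db: FakeDB, instance: Instance) -> None:
    """The message is lost; a cast has no reply, so nobody notices."""


def api_reboot(db: FakeDB, instance: Instance,
               cast: Callable[[FakeDB, Instance], None]) -> None:
    """API side: record the task_state first, then fire and forget."""
    instance.task_state = REBOOTING
    db.save(instance)
    cast(db, instance)


db = FakeDB()
api_reboot(db, Instance("delivered"), working_cast)
api_reboot(db, Instance("dropped"), dropped_cast)
print(db.rows)
```

In this model the "dropped" instance keeps its REBOOTING task_state
indefinitely, because the only code that would clear it never runs.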


I think that both of the RPC message loss scenarios are valid with 
current nova code, so we really do need an audit to clean up after this 
sort of thing.
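
To make the idea concrete, here is a minimal sketch of what such an audit
could look like. Everything in it is an assumption for illustration:
InstanceRecord, audit_stuck_reboots() and the timeout are hypothetical names,
not existing nova code, and clearing the task_state is only one possible
recovery action (re-issuing the reboot or flagging an error are others).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional

REBOOTING = "rebooting"  # hypothetical stand-in for task_states.REBOOTING


@dataclass
class InstanceRecord:
    uuid: str
    task_state: Optional[str]
    updated_at: datetime


def audit_stuck_reboots(instances: Iterable[InstanceRecord],
                        now: datetime,
                        timeout: timedelta) -> List[str]:
    """Clear REBOOTING on instances not updated within `timeout`.

    Returns the uuids of the instances that were cleaned up.
    """
    cleaned = []
    for inst in instances:
        if inst.task_state == REBOOTING and now - inst.updated_at > timeout:
            inst.task_state = None  # assumed recovery action
            inst.updated_at = now
            cleaned.append(inst.uuid)
    return cleaned


now = datetime(2014, 3, 20, 15, 0, 0)
records = [
    InstanceRecord("stuck", REBOOTING, now - timedelta(minutes=30)),
    InstanceRecord("recent", REBOOTING, now - timedelta(seconds=10)),
    InstanceRecord("idle", None, now - timedelta(hours=2)),
]
cleaned = audit_stuck_reboots(records, now, timedelta(minutes=5))
print(cleaned)
```

The timeout matters here: it has to be longer than any legitimate reboot,
otherwise the audit would clear the state of an instance that is still
rebooting.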

Chris


