[openstack-dev] [nova] How to fix the race condition issue between deleting and soft reboot?

John Garbutt john at johngarbutt.com
Mon Nov 11 12:11:33 UTC 2013


It seems we still agreed that terminate should be able to happen at any time.

I thought I remembered some code in the manager that treats
InstanceNotFound errors differently.

I would rather we ensure InstanceNotFound is raised to indicate we
have hit this race condition, and let the compute manager unify how we
deal with that across all sorts of operations.
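
To make the idea concrete, here is a minimal sketch of what "unify how we deal with that" could look like. Everything in it is hypothetical: the `InstanceNotFound` class is a local stand-in rather than nova's own exception, and `ignore_if_deleted`, `FakeDriver` and `FakeComputeManager` are invented names used only for illustration, not existing nova code.

```python
import functools
import logging

LOG = logging.getLogger(__name__)


class InstanceNotFound(Exception):
    """Local stand-in for the driver-level 'instance is gone' error."""


def ignore_if_deleted(operation):
    """Hypothetical decorator: handle InstanceNotFound in one place.

    If the driver reports that the instance vanished mid-operation, we
    assume a concurrent delete won the race and stop quietly.
    """
    @functools.wraps(operation)
    def wrapper(manager, instance_uuid, *args, **kwargs):
        try:
            return operation(manager, instance_uuid, *args, **kwargs)
        except InstanceNotFound:
            LOG.info("Instance %s disappeared during %s; assuming a "
                     "concurrent delete won the race",
                     instance_uuid, operation.__name__)
            return None
    return wrapper


class FakeDriver:
    """Toy driver that raises InstanceNotFound for unknown instances."""

    def __init__(self):
        self.instances = {"abc"}

    def reboot(self, instance_uuid):
        if instance_uuid not in self.instances:
            raise InstanceNotFound(instance_uuid)
        return "rebooted"


class FakeComputeManager:
    """Toy manager: every operation shares the same race handling."""

    def __init__(self, driver):
        self.driver = driver

    @ignore_if_deleted
    def reboot_instance(self, instance_uuid):
        return self.driver.reboot(instance_uuid)
```

The point of the sketch is only that the driver's job is to raise a single well-known exception, while the decision about what that means lives in one place in the manager.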

John

On 11 November 2013 02:57, Wangpan <hzwangpan at corp.netease.com> wrote:
> Hi all,
>
> I would like to raise this question again now that the Hong Kong summit is
> over; you may have time to discuss it now.
> Thanks a lot!
>
> 2013-11-11
> ________________________________
> Wangpan
> ________________________________
> From: "Wangpan"<hzwangpan at corp.netease.com>
> Sent: 2013-11-04 12:08
> Subject: [openstack-dev] [nova] How to fix the race condition issue between
> deleting and soft reboot?
> To: "OpenStack Development Mailing List (not for usage
> questions)"<openstack-dev at lists.openstack.org>
> Cc:
>
> Hi all,
>
> I have a question about fixing a race condition between deleting and soft
> rebooting an instance. There are two issues:
> 1. If we soft reboot an instance and then delete it, the instance may not be
> deleted and may stay stuck in the deleting task state. This is caused by the
> bug below:
> https://bugs.launchpad.net/nova/+bug/1111213
> I already fixed this bug several months ago (for the libvirt driver only).
> 2. The other issue is that if the instance is rebooted just before the files
> under the instance dir are deleted, it may end up as a running but deleted
> instance. That bug is here:
> https://bugs.launchpad.net/nova/+bug/1246181
> I want to fix it now, and I need your advice.
> The commit is here: https://review.openstack.org/#/c/54477/ , you can post
> your advice on gerrit or mail it to me.
>
> Possible ways to fix bug #2 (I only have the libvirt driver in mind):
> 1. Add a lock to the reboot operation, like the one the delete operation
> has, so that reboot and delete run in sequence.
> On the other hand, a soft reboot may take 120s if the instance doesn't
> support graceful shutdown. I think that is too long for a user to wait to
> delete an instance, so this may not be the best way.
> 2. Check the instance state at the end of the _cleanup method in the libvirt
> driver and, if the instance is still running, destroy it again.
> This works, but neither Nikola Dipanov nor I like this 'ugly' approach.
> 3. Check the instance's vm state and task state in the nova db before booting
> during reboot and, if it is deleted/deleting, stop the reboot. This accesses
> the db at the driver level, so it is an 'ugly' approach too.
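
As a rough illustration of option 1 above, the following toy model serializes reboot and delete with a per-instance lock. It is a sketch under stated assumptions, not nova or libvirt driver code: `ToyDriver`, `lock_for` and the in-memory `running`/`deleted` sets are all invented for this example.

```python
import threading
from collections import defaultdict

# Hypothetical per-instance lock registry (not nova's locking helper).
_instance_locks = defaultdict(threading.Lock)
_registry_guard = threading.Lock()


def lock_for(instance_uuid):
    """Return the lock shared by all operations on one instance."""
    with _registry_guard:
        return _instance_locks[instance_uuid]


class ToyDriver:
    """In-memory stand-in for a virt driver."""

    def __init__(self):
        self.running = set()
        self.deleted = set()

    def soft_reboot(self, instance_uuid):
        with lock_for(instance_uuid):
            # Once a delete has completed, refuse to boot the guest again.
            if instance_uuid in self.deleted:
                return False
            self.running.discard(instance_uuid)  # guest shuts down
            self.running.add(instance_uuid)      # guest boots again
            return True

    def delete(self, instance_uuid):
        with lock_for(instance_uuid):
            # Cannot interleave with soft_reboot: both hold the same lock.
            self.running.discard(instance_uuid)
            self.deleted.add(instance_uuid)
```

The cost is the one already noted in option 1: `delete` blocks for as long as `soft_reboot` holds the lock, which for a guest without graceful shutdown could be the full 120s.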
>
> Nikola suggests that 'maybe we can leverage task/vm states and refactor how
> reboot is done so we can back out of a reboot on a delete', but I think we
> should let users delete an instance at any time and in any state, so a
> delete during a 'soft reboot' should not be forbidden.
>
> Thanks, and I look forward to your feedback!
>
> 2013-11-04
> ________________________________
> Wangpan
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>


