[openstack-dev] [Nova] nova-compute deadlock
Richard W.M. Jones
rjones at redhat.com
Sat Jun 7 11:51:01 UTC 2014
On Sat, May 31, 2014 at 01:25:04AM +0800, Qin Zhao wrote:
> Hi all,
> When I ran Icehouse code, I encountered a strange problem. The nova-compute
> service becomes stuck when I boot instances. I reported this bug in
> After thinking about it for several days, I believe I know its root cause.
> This bug appears to be a deadlock caused by a leaked pipe fd. I drew a
> diagram to illustrate the problem.
> However, I have not found a very good solution to prevent this deadlock.
> The problem involves the Python runtime, libguestfs, and eventlet, so the
> situation is a little complicated. Is there any expert who can help me
> look for a solution? I would appreciate your help!
Thanks for the useful diagram. libguestfs itself is very careful to
open all file descriptors with O_CLOEXEC (atomically if the OS
supports that), so I'm fairly confident that the bug is in Python 2,
not in libguestfs.
Note also that g.shutdown() sends SIGKILL (kill -9) to the
subprocess. Furthermore, you can obtain the qemu PID with g.get_pid()
and send any signal you want to that process.
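As a rough illustration of the g.get_pid() route, something along these lines might work. This is only a sketch: the helper name signal_appliance is hypothetical, and it assumes g is a guestfs handle that has already been launched.

```python
import os
import signal


def signal_appliance(g, sig=signal.SIGTERM):
    """Send `sig` to the qemu subprocess behind handle `g`.

    Hypothetical helper: assumes `g` is a launched guestfs handle
    (anything exposing get_pid() works for illustration).
    """
    pid = g.get_pid()   # PID of the qemu subprocess
    os.kill(pid, sig)   # deliver the chosen signal to that process
    return pid
```

The caller picks the signal, so a stuck appliance could be sent SIGTERM first and SIGKILL only if that has no effect.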
I wonder if a simpler way to fix this wouldn't be something like
adding a tiny C extension to the Python code to use pipe2 to open the
Python pipe with O_CLOEXEC atomically? Are we allowed Python
extensions in OpenStack?
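If a compiled extension turns out not to be acceptable, the same idea could perhaps be approximated with ctypes. The following is a minimal sketch, not a tested fix for nova-compute: the function name pipe_cloexec is made up for illustration, it assumes Linux with a libc that provides pipe2(), and the fallback O_CLOEXEC value is the one used on common Linux architectures.

```python
import ctypes
import ctypes.util
import os

# Assumption: Python 2's os module does not export O_CLOEXEC, so fall back
# to the value used on common Linux architectures (octal 02000000).
O_CLOEXEC = getattr(os, "O_CLOEXEC", 0o2000000)


def pipe_cloexec():
    """Return (read_fd, write_fd), both created with close-on-exec set."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    fds = (ctypes.c_int * 2)()
    # pipe2() sets the flag at creation time, so there is no window in which
    # a concurrent fork/exec could inherit the descriptors.
    if libc.pipe2(fds, O_CLOEXEC) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return fds[0], fds[1]
```

Calling os.pipe() and then setting FD_CLOEXEC with fcntl would not be equivalent, because another thread could fork between the two calls. That gap is the reason for using pipe2 here.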
BTW do feel free to CC libguestfs at redhat.com on any libguestfs
problems you have. You don't need to subscribe to the list.
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.