[openstack-dev] [nova] nova-compute blocking main thread under heavy disk IO
chris.friesen at windriver.com
Mon Feb 22 15:43:29 UTC 2016
We've recently run into some interesting behaviour that I thought I should bring
up to see if we want to do anything about it.
Basically the problem seems to be that nova-compute is doing disk I/O from the
main thread, and if that I/O blocks then it can block all of nova-compute (since
all eventlet greenthreads will be blocked along with it). Examples that we've
found include glance image download, file renaming, instance directory creation,
opening the instance xml file, etc. We've seen nova-compute block for upwards of
50 seconds.
Now, the specific case where we hit this is not a production environment. It's
only got one spinning disk shared by all the guests, the guests were hammering
on the disk pretty hard, and the I/O scheduler for the instance disk was CFQ,
which seems to be buggy in our kernel.
But the fact remains that nova-compute is doing disk I/O from the main thread,
and if the guests push that disk hard enough then nova-compute is going to suffer.
Given the above...would it make sense to use eventlet.tpool or similar to
perform all disk access in a separate OS thread? There'd likely be a bit of a
performance hit, but at least it would isolate the main thread from IO blocking.