[openstack-dev] [nova] nova-compute blocking main thread under heavy disk IO
jaypipes at gmail.com
Mon Feb 22 16:17:28 UTC 2016
On 02/22/2016 10:43 AM, Chris Friesen wrote:
> Hi all,
> We've recently run into some interesting behaviour that I thought I
> should bring up to see if we want to do anything about it.
> Basically the problem seems to be that nova-compute is doing disk I/O
> from the main thread, and if it blocks then it can block all of
> nova-compute (since all eventlets will be blocked). Examples that we've
> found include glance image download, file renaming, instance directory
> creation, opening the instance xml file, etc. We've seen nova-compute
> block for upwards of 50 seconds.
> Now the specific case where we hit this is not a production
> environment. It has only one spinning disk shared by all the guests,
> the guests were hammering on the disk pretty hard, and the IO scheduler
> for the instance disk was CFQ, which seems to be buggy in our kernel.
> But the fact remains that nova-compute is doing disk I/O from the main
> thread, and if the guests push that disk hard enough then nova-compute
> is going to suffer.
> Given the above...would it make sense to use eventlet.tpool or similar
> to perform all disk access in a separate OS thread? There'd likely be a
> bit of a performance hit, but at least it would isolate the main thread
> from IO blocking.
This is probably a good idea, but it will require quite a few code
changes. I think in the past we've taken the expedient route of just
running problematic code in a greenthread using utils.spawn().
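
For reference, a minimal sketch of what the tpool approach could look like.
eventlet.tpool.execute() is the real eventlet call. Everything else here
(run_blocking, write_instance_xml, the paths) is made up for illustration
and is not nova code. Note that a plain greenthread still shares the one OS
thread, so as far as I understand it a blocking file call inside it would
still stall the hub; tpool hands the call to a native worker thread instead.

```python
import os
import tempfile

try:
    from eventlet import tpool  # real eventlet module
except ImportError:
    tpool = None  # keeps the sketch runnable where eventlet is not installed


def run_blocking(func, *args, **kwargs):
    """Hypothetical helper: run a blocking call in a native OS thread.

    With eventlet available, tpool.execute() passes the call to a worker
    thread and only the calling greenthread waits for the result. Without
    eventlet the call simply runs inline (and blocks, as it does today).
    """
    if tpool is None:
        return func(*args, **kwargs)
    return tpool.execute(func, *args, **kwargs)


def write_instance_xml(path, xml):
    """Stand-in for one of the disk operations mentioned in the thread."""
    with open(path, "w") as f:
        f.write(xml)
        f.flush()
        os.fsync(f.fileno())  # the kind of call that can stall on a busy disk
    return path


# Illustrative usage with made-up paths under a temporary directory.
instance_dir = tempfile.mkdtemp()
run_blocking(os.makedirs, os.path.join(instance_dir, "instance-0001"))
xml_path = run_blocking(
    write_instance_xml,
    os.path.join(instance_dir, "instance-0001", "libvirt.xml"),
    "<domain/>",
)
```

The cost is a thread hand-off per call, which is the performance hit Chris
mentions, so it probably only makes sense around calls that actually touch
the instance disk.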