Open Stack

Mon Feb 22 16:17:28 UTC 2016

On 02/22/2016 10:43 AM, Chris Friesen wrote:
> Hi all,
>
> We've recently run into some interesting behaviour that I thought I
> should bring up to see if we want to do anything about it.
>
> Basically the problem seems to be that nova-compute is doing disk I/O
> from the main thread, and if it blocks then it can block all of
> nova-compute (since all eventlets will be blocked).  Examples that we've
> found include glance image download, file renaming, instance directory
> creation, opening the instance xml file, etc.  We've seen nova-compute
> block for upwards of 50 seconds.
>
> Now the specific case where we hit this is not a production
> environment.  It's only got one spinning disk shared by all the guests,
> the guests were hammering on the disk pretty hard, and the IO scheduler for
> the instance disk was CFQ, which seems to be buggy in our kernel.
>
> But the fact remains that nova-compute is doing disk I/O from the main
> thread, and if the guests push that disk hard enough then nova-compute
> is going to suffer.
>
> Given the above...would it make sense to use eventlet.tpool or similar
> to perform all disk access in a separate OS thread?  There'd likely be a
> bit of a performance hit, but at least it would isolate the main thread
> from IO blocking.

This is probably a good idea, but will require quite a bit of code 
change. I think in the past we've taken the expedient route of just 
exec'ing problematic code in a greenthread using utils.spawn().
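
As a rough illustration of the eventlet.tpool idea quoted above, here is a
minimal sketch, not nova code. eventlet.tpool.execute() is a real eventlet
call that runs a function in a native OS thread while blocking only the
calling greenthread. The helper names (_write_file, write_file_offloaded,
demo) and the file name are made up for the example, and the ImportError
fallback exists only so the sketch still runs where eventlet is not
installed.

```python
import os
import tempfile

try:
    from eventlet import tpool  # real eventlet module: native OS thread pool
except ImportError:  # assumption: keep the sketch runnable without eventlet
    tpool = None


def _write_file(path, data):
    """Plain blocking write; on a saturated disk this can stall for seconds."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    return len(data)


def write_file_offloaded(path, data):
    """Hypothetical helper: run the blocking write in a tpool worker thread.

    tpool.execute() suspends only the calling greenthread, so other
    greenthreads keep running while the worker thread waits on the disk.
    """
    if tpool is None:
        return _write_file(path, data)  # fallback: direct, blocking call
    return tpool.execute(_write_file, path, data)


def demo():
    """Write a small file through the helper and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "instance.xml")
        written = write_file_offloaded(path, b"<domain/>")
        with open(path, "rb") as f:
            return written, f.read()
```

The cost is a thread hand-off per call, which is the "bit of a performance
hit" mentioned above. In exchange, a stalled disk only delays the one
operation that is waiting on it.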

Best,
-jay

[openstack-dev] [nova] nova-compute blocking main thread under heavy disk IO