[openstack-dev] [nova] nova-compute blocking main thread under heavy disk IO

Chris Friesen chris.friesen at windriver.com
Mon Feb 22 19:13:39 UTC 2016

On 02/22/2016 11:20 AM, Daniel P. Berrange wrote:
> On Mon, Feb 22, 2016 at 12:07:37PM -0500, Sean Dague wrote:
>> On 02/22/2016 10:43 AM, Chris Friesen wrote:

>>> But the fact remains that nova-compute is doing disk I/O from the main
>>> thread, and if the guests push that disk hard enough then nova-compute
>>> is going to suffer.
>>> Given the above...would it make sense to use eventlet.tpool or similar
>>> to perform all disk access in a separate OS thread?  There'd likely be a
>>> bit of a performance hit, but at least it would isolate the main thread
>>> from IO blocking.
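
A minimal sketch of that idea, for illustration only. With eventlet the
call would be eventlet.tpool.execute(func, *args). To keep the example
free of dependencies, it uses the standard library's asyncio and
run_in_executor as a stand-in: the same pattern of handing a blocking
disk read to a worker OS thread while the main loop keeps running. All
function names here are hypothetical, not nova code.

```python
import asyncio
import os
import tempfile
import threading


def read_file_blocking(path):
    """Blocking disk read; returns (id of the thread that ran it, data)."""
    with open(path, "rb") as f:
        return threading.get_ident(), f.read()


async def read_file_offloaded(path):
    """Run the blocking read in a worker OS thread and await the result."""
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, read_file_blocking, path)


async def heartbeat(ticks, interval=0.01):
    """Stand-in for periodic work that must keep running on the main thread."""
    for _ in range(3):
        ticks.append(threading.get_ident())
        await asyncio.sleep(interval)


async def demo():
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"x" * 4096)
        ticks = []
        # The read and the heartbeat run concurrently; only the read
        # leaves the main thread.
        (worker_tid, data), _ = await asyncio.gather(
            read_file_offloaded(path), heartbeat(ticks)
        )
    finally:
        os.unlink(path)
    return worker_tid, len(data), ticks


def run_demo():
    return asyncio.run(demo())
```

The heartbeat keeps ticking on the main thread even if the read stalls
in the kernel, which is the isolation being asked for. The cost is a
thread hand-off per call.
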
>> Making nova-compute more robust is fine, though the reality is once you
>> IO starve a system, a lot of stuff is going to fall over weird.
>> So there has to be a tradeoff of the complexity of any new code vs. what
>> it gains. I think individual patches should be evaluated as such, or a
>> spec if this is going to get really invasive.
> There are OS level mechanisms (eg cgroups blkio controller) for doing
> I/O prioritization that you could use to give Nova higher priority over
> the VMs, to reduce (if not eliminate) the possibility that a busy VM
> can inflict a denial of service on the mgmt layer.  Of course figuring
> out how to use that mechanism correctly is not entirely trivial.

The 50+ second delays were with CFQ as the disk scheduler.  (No cgroups, just
CFQ with equal priorities on nova-compute and the guests.)  This was with a
3.10 kernel though, so maybe CFQ behaves better on newer kernels.
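
For reference, the blkio approach Daniel mentions might look roughly
like the sketch below. It assumes the cgroup v1 blkio controller with
CFQ proportional weights (blkio.weight, allowed range 10 to 1000). The
helper name and the group name are made up for illustration, and the
mount point in the usage note is only the common default.

```python
import os


def set_blkio_weight(cgroup_root, group, weight, pids=()):
    """Create a cgroup v1 blkio group, set its weight, and move pids in."""
    if not 10 <= weight <= 1000:
        raise ValueError("blkio.weight must be between 10 and 1000")
    group_dir = os.path.join(cgroup_root, group)
    os.makedirs(group_dir, exist_ok=True)
    with open(os.path.join(group_dir, "blkio.weight"), "w") as f:
        f.write(str(weight))
    for pid in pids:
        # cgroup.procs accepts one pid per write.
        with open(os.path.join(group_dir, "cgroup.procs"), "w") as f:
            f.write(str(pid))
    return group_dir
```

On a real host this would be called as root with something like
set_blkio_weight("/sys/fs/cgroup/blkio", "nova", 1000, [nova_pid]),
leaving the guests at a lower weight in their own groups.
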

If you put nova-compute at high priority, then glance image downloads, qemu-img
format conversions, and volume clearing (all of which run in nova-compute or
its child processes) will also run at the higher priority, potentially
impacting running VMs.

In an ideal world we'd have per-VM cgroups and all activity on behalf of a 
particular VM would be done in the context of that VM's cgroup.


