Hi Damian,
With our setup we started at 4,800 IOPS and ~0.3ms latency with standard settings and got to 17,800 IOPS with ~0.054ms latency after some optimizations. Here are the settings that made the difference:
- change the I/O scheduler to noop in the VM (echo 'noop' > /sys/block/sda/queue/scheduler)
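
For reference, this is roughly how we check the active scheduler and watch the merging it does (sda is just our example device; on newer blk-mq kernels the equivalent scheduler is called "none" rather than "noop"):

   # the active scheduler is shown in [brackets]
   cat /sys/block/sda/queue/scheduler

   # switch for the running system (not persistent across reboots)
   echo noop > /sys/block/sda/queue/scheduler

   # the rrqm/s and wrqm/s columns show read/write request merges per second
   iostat -x sda 1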
Thank you for the info! It appears that the noop scheduler still merges requests, so to go from 4,800 to 17,800 IOPS you are likely seeing roughly 3 to 4 requests merged into each command that reaches the device. I'll have to check whether that changes anything on this end, since I thought the default scheduler also performed request merging.

Regarding sleep states, you may want to look at the power-management settings in the BIOS. "Energy efficient" profiles definitely have an impact on latency, but as you noticed, the governor can also override some of those sleep states if you set it to performance.

We did a little more testing with iothreads on our Proxmox systems, since it is easy to enable/disable them on a virtual disk there. The performance difference on a relatively idle compute node and VM is extremely small (barely noticeable). With a busy VM it may make a difference, but we haven't had time to test that, so all the work involved in enabling iothreads in OpenStack may not be worth it.

One of our storage vendors did some testing long ago as well, and they indicated that, to benefit from iothreads, dedicated cores should be assigned to them. That creates a bit more resource-allocation complexity in OpenStack, especially if live migration is required.

Eric
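
P.S. In case it helps, this is roughly how we inspect the governor and the C-states the idle driver exposes on our nodes; the sysfs layout can vary by kernel and platform, so treat it as a sketch:

   # current governor, then forcing performance on every core (as root)
   cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
   for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
       echo performance > "$g"
   done

   # which C-states exist and their exit latencies in microseconds
   grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name
   grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/latency

Booting with intel_idle.max_cstate=1 (plus processor.max_cstate=1) caps the deeper states entirely on Intel machines, but the BIOS profile is usually the first thing to check.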
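
P.P.S. If you want to repeat the iothread comparison on Proxmox before committing to the OpenStack work, something like the following should do it. VMID 100 and the volume name are made-up examples, and iothread=1 on a scsi disk needs the virtio-scsi-single controller:

   # attach the disk with an iothread enabled
   qm set 100 --scsihw virtio-scsi-single \
              --scsi0 local-lvm:vm-100-disk-0,iothread=1

   # confirm the setting took
   qm config 100 | grep -E 'scsihw|scsi0'

Toggling iothread=1 off and on between benchmark runs against the same disk is how we got the idle-node comparison above.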