[nova] iothread support with Libvirt

Eric K. Miller emiller at genesishosting.com
Thu Jan 6 20:20:39 UTC 2022

Hi Sean,

Thanks, as always, for your reply!

> hi up until recently the advice from our virt team was that iothreads were not
> really needed
> for openstack, however in the last 6 weeks they have actually asked us to
> consider enabling them

I don't have the data to know whether iothreads improve performance or not.  Rather, I made the assumption that I/O handled on a dedicated core would likely perform much better than without one.  If someone has any data on this, it would be extremely useful.

The issue we are trying to resolve is that high-speed local storage is literally 10x, and sometimes 15x, slower in a VM than on the host.  The local storage can reach upwards of 8 GiB/sec and 1 million IOPS.

It's not necessarily throughput we're after, though - it is latency, and the latency in QEMU/KVM is simply too high to get adequate storage performance inside a VM.

If iothread(s) do not help, then the point of implementing the parameter in Nova is probably moot.

> so work will be happening in qemu/libvirt to always create at least one
> iothread going forward and
> affine it to the same set of cores as the emulator threads by default.

That sounds like a good idea, although I did read somewhere in the QEMU docs that not all drivers support iothreads, and trying to use them with unsupported drivers will likely crash QEMU - but I don't know how old those docs were.  It seems reasonable since the "big QEMU lock" is not being used for the iothread(s).
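
For anyone wanting to try this by hand on a test domain, here is a minimal sketch of the libvirt domain XML changes involved: an <iothreads> count, an <iothreadpin> under <cputune>, and an iothread attribute on the disk <driver>.  The function name add_iothread and the choice to touch only bus='virtio' disks are my own assumptions for illustration, not anything Nova or libvirt does automatically.

```python
import xml.etree.ElementTree as ET


def add_iothread(domain_xml: str, cpuset: str) -> str:
    """Return domain XML with one iothread assigned to every virtio-blk disk.

    Hypothetical helper for experiments only; it does not check whether the
    disk backend actually supports iothreads.
    """
    root = ET.fromstring(domain_xml)

    # <iothreads>1</iothreads> asks libvirt to create a single iothread.
    iothreads = root.find("iothreads")
    if iothreads is None:
        iothreads = ET.SubElement(root, "iothreads")
    iothreads.text = "1"

    # <iothreadpin> pins iothread 1 to the given host CPUs (e.g. "2-3").
    cputune = root.find("cputune")
    if cputune is None:
        cputune = ET.SubElement(root, "cputune")
    pin = ET.SubElement(cputune, "iothreadpin")
    pin.set("iothread", "1")
    pin.set("cpuset", cpuset)

    # Only disks on the virtio bus are touched; other buses are left alone.
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        driver = disk.find("driver")
        if (target is not None and target.get("bus") == "virtio"
                and driver is not None):
            driver.set("iothread", "1")

    return ET.tostring(root, encoding="unicode")
```

The output could be fed to "virsh define" on a scratch guest.  Nova regenerates the domain XML on operations such as a hard reboot, so a change made this way would not persist on a Nova-managed instance.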

> we don't have a downstream rfe currently filed for iothreads specifically but
> we do for virtio scsi multi queue support
> https://bugzilla.redhat.com/show_bug.cgi?id=1880273

I found this old blueprint and implementation (that apparently was never accepted due to tests failing in various environments):

> to do together. my understanding is that without iothreads multi queue virtio
> scsi does not provide as much of
> a performance boost as with iothreads.

I can imagine that being the case - since a spinning loop has super-low latency compared to an interrupt.

> if you or others have capacity to work on this i would be happy to work on a
> spec with ye to enable it.

I wish I had the bandwidth to learn how, but since I'm not a Python developer, nor have a development environment ready to go (I'm mostly performing cloud operator and business support functions), I probably couldn't help much other than provide feedback.

> effectively what i was planning to propose if we got around to it is adding a
> new config option
> cpu_iothread_set which would default to the same value as cpu_shared_set.
> this effectively will ensure that without any config updates all existing
> deployments will start benefiting
> from iothreads and allow you to still dedicate a set of cores to running the
> iothreads separate from the cpu_shared_set
> if you want this to also benefit floating vms not just pinned vms.

I would first suggest asking the QEMU folks whether any storage drivers are incompatible with iothreads in a way that could cause issues if iothreads were enabled by default.  I suggest a more cautious approach: leave the default as-is and allow a user to enable iothreads themselves.  The default could always be changed later if there isn't any negative feedback from those who tried using iothreads.

> in addition to that a new flavor extra spec/image property would be added
> similar to cpu_emulator_threads.
> im not quite sure how that extra spec should work but
> hw:cpu_iothread_policy would either support the same values as
> hw:cpu_emulator_threads where
> hw:cpu_iothread_policy=share would allocate an iothread that floats over the
> cpu_iothread_set (which is the same as cpu_shared_set by default)
> and hw:cpu_iothread_policy=isolate would allocate an additional iothread
> from the cpu_dedicated_set.
> hw:cpu_iothread_policy=share would be the default behavior if
> cpu_shared_set or cpu_iothread_set was defined in the config and no
> flavor extra
> spec or image property was defined. basically all vms would have at least 1
> iothread that floated over the shared pool if a shared pool was configured
> on the host.

I will have to review this more carefully when I have a bit more time.

> that is option a
> option b would be to also support
> hw:cpu_iothread_count so you could ask for n iothreads either from the
> shared/iothread set or dedicated set depending on the value of
> hw:cpu_iothread_policy
> im not really sure if there is a need for more than 1 iothread. my
> understanding is that once you have at least 1 there are diminishing returns.
> it will improve your performance if you have more provided you have multiple
> disks/volumes attached but not as much as having the initial iothread.

I would guess that multiple iothreads would benefit multiple VMs, where each VM would use its own iothread/dedicated core.  So, I think providing the possibility for multiple iothreads should be considered, with assignment of these threads to individual VMs.  However, this brings up a significantly more complex resource allocation requirement, not to mention resource allocation during live migration.
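
To make options a and b a bit more concrete, here is a rough sketch of how the proposed knobs might resolve into a per-instance plan.  Everything here is hypothetical: cpu_iothread_set, hw:cpu_iothread_policy and hw:cpu_iothread_count are only proposals from this thread and do not exist in Nova, and plan_iothreads is a name I made up.  The sketch follows the default described above (share when a shared pool is configured); a more cautious opt-in default would simply return None whenever no policy is set.

```python
from typing import Dict, Optional, Set


def parse_cpu_set(spec: str) -> Set[int]:
    """Parse a CPU list such as "0-3,8" into a set of CPU ids."""
    cpus: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def plan_iothreads(extra_specs: Dict[str, str],
                   cpu_shared_set: Optional[str],
                   cpu_iothread_set: Optional[str] = None) -> Optional[dict]:
    """Resolve the (hypothetical) iothread settings for one instance."""
    # Proposed fallback: cpu_iothread_set defaults to cpu_shared_set.
    pool_spec = cpu_iothread_set or cpu_shared_set

    policy = extra_specs.get("hw:cpu_iothread_policy")
    if policy is None:
        if not pool_spec:
            return None  # no pool configured: behave as today, no iothreads
        policy = "share"
    if policy not in ("share", "isolate"):
        raise ValueError("unknown iothread policy: %s" % policy)

    # Option b: allow more than one iothread per instance.
    count = int(extra_specs.get("hw:cpu_iothread_count", "1"))
    if count < 1:
        raise ValueError("iothread count must be at least 1")

    if policy == "share":
        if not pool_spec:
            raise ValueError("share policy needs a shared or iothread CPU set")
        # The iothreads float over the pool rather than owning a core each.
        return {"policy": "share", "count": count,
                "float_over": parse_cpu_set(pool_spec)}

    # isolate: each iothread needs its own core from cpu_dedicated_set;
    # choosing those cores is left to the scheduler and not modelled here.
    return {"policy": "isolate", "count": count, "float_over": None}
```

Note that the isolate case is where the resource allocation gets complicated: the extra dedicated cores have to be claimed on the source host and again on the destination during live migration.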

> is this something you would be willing to work on and implement?
> i would be happy to review any spec in this area and i can bring it up
> downstream again but i cant commit to working on this in the z release.
> this would require some minor rpc changes to ensure live migration works
> properly as the iothread set or cpu shared set could be different on different
> hosts. but beyond that the feature is actually pretty simple to enable.

I think we need to do some testing to prove the performance benefits first - before spending the time to implement.
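
As a starting point for that testing, a crude probe like the one below could be run once on the host and once inside a guest against the same storage, and the numbers compared.  It is only a sketch: the function name and defaults are my own, it measures small synchronous writes (write plus fsync) rather than a realistic workload, and a proper benchmark tool such as fio would give far better data.

```python
import os
import statistics
import tempfile
import time


def measure_sync_write_latency(directory: str, samples: int = 200,
                               block_size: int = 4096) -> dict:
    """Time write+fsync of small blocks; return latency stats in microseconds."""
    payload = os.urandom(block_size)
    latencies = []
    # The scratch file is created on the filesystem under test and removed
    # again afterwards.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the block out to the device
            latencies.append((time.perf_counter() - start) * 1e6)
    finally:
        os.close(fd)
        os.unlink(path)

    latencies.sort()
    return {
        "samples": samples,
        "median_us": statistics.median(latencies),
        "p99_us": latencies[min(samples - 1, int(samples * 0.99))],
        "max_us": latencies[-1],
    }
```

Comparing the median and p99 figures with and without an iothread attached to the disk should show whether the latency gap actually narrows.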

> no there is no way to enable them out of band of nova today.
> you technically could wrap the qemu binary with a script that injects parameters
> but that obviously would not be supported upstream.
> but that would be a workaround if you really needed it
> https://review.opendev.org/c/openstack/devstack/+/817075 is an example
> of such a script
> that breaks apparmor and selinux but you could probably make it work with
> enough effort.
> although i would suggest just implementing the feature upstream and doing
> a downstream backport instead.

Interesting - maybe I can hack this for testing and proof-of-concept purposes.  Thanks for the suggestion!  I'll see if we can figure out how to test iothreads in our environment where the high-speed local storage exists.
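
A proof-of-concept wrapper along those lines might look roughly like the sketch below.  It is not the script from the devstack review; it is my own guess at the idea.  The REAL_QEMU path and the iothread0 id are hypothetical, and it assumes libvirt passes disks as "-device virtio-blk-pci,key=value,..." arguments.  Newer libvirt versions may emit JSON-style -device arguments, which this sketch leaves untouched.

```python
import os
import sys
from typing import List

# Hypothetical location of the renamed, real QEMU binary.
REAL_QEMU = "/usr/bin/qemu-system-x86_64.real"
# Arbitrary object id chosen for this sketch.
IOTHREAD_ID = "iothread0"


def inject_iothread(args: List[str]) -> List[str]:
    """Return QEMU arguments with one iothread attached to virtio-blk disks.

    `args` is the argument list without the program name. Only legacy
    key=value "-device virtio-blk-pci,..." values are modified.
    """
    out = list(args)
    patched = False
    for i in range(len(out) - 1):
        if out[i] != "-device":
            continue
        value = out[i + 1]
        if value.startswith("virtio-blk-pci,") and "iothread=" not in value:
            out[i + 1] = value + ",iothread=" + IOTHREAD_ID
            patched = True
    if patched:
        # Define the iothread object ahead of the devices that refer to it.
        out = ["-object", "iothread,id=" + IOTHREAD_ID] + out
    return out


def main() -> None:
    """Entry point for the wrapper: replace this process with real QEMU."""
    os.execv(REAL_QEMU, [REAL_QEMU] + inject_iothread(sys.argv[1:]))
```

The wrapper script installed in place of the QEMU binary would just call main().  Invocations without a virtio-blk device (such as libvirt's capability probing) pass through unchanged.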

