[nova] Is there support for discard/trim in virtio-blk?
Hi,

according to our research, discard/trim is commonly documented as "not supported by virtio-blk" in the wider community and in openstack nova[0].

I understand that this was not supported by the kernel or qemu when discard support was initially implemented[1] in openstack, but times have changed.

Virtio-blk in the upstream kernel[2] and in qemu[3] clearly does support discard/trim, which we discovered thanks to StackExchange[4].

So my question is: has anyone successfully used trim/discard with virtio-blk in nova-provisioned VMs?

[1] makes me guess that there is no code path that works with virtio-blk, but I'm not sure I have understood all the code paths in nova yet. We are still working our way through it, though.

We currently use the train release in conjunction with ceph rbd volumes for testing this feature, but have not yet been able to use it successfully. Looking at nova's master branch, not much seems to have changed regarding trim support.

If this is not a supported configuration currently, should I submit a blueprint to enable this feature? Would there be other devs/ops interested in this?

Any help or pointers would be appreciated.

[0]: https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L2... (nova debug log saying discard is unsupported for virtio)
[1]: https://blueprints.launchpad.net/nova/+spec/cinder-backend-report-discard (initial implementation of discard support in openstack)
[2]: https://github.com/torvalds/linux/commit/1f23816b8eb8fdc39990abe166c10a18c16... (available since linux 5.0-rc1)
[3]: https://github.com/qemu/qemu/commit/37b06f8d46fe602e630e4bdce24e80a3e0f70cc2 (available since qemu 4.0.0)
[4]: https://unix.stackexchange.com/a/518223

--
Regards
Sven Kieske
Systems engineer

Mittwald CM Service GmbH & Co. KG
Königsberger Straße 4-6
32339 Espelkamp
Tel.: 05772 / 293-900
Fax: 05772 / 293-333
https://www.mittwald.de
Managing directors: Robert Meyer, Florian Jürgens
Tax no.: 331/5721/1033, VAT ID: DE814773217, HRA 6640, AG Bad Oeynhausen
General partner: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen
Information on data processing in the course of our business activities pursuant to Art. 13-14 GDPR is available at www.mittwald.de/ds.
On Thu, 2021-09-02 at 16:48 +0000, Sven Kieske wrote:
Hi,
according to our research discard/trim is commonly documented as "not supported by virtio-blk" in the wider community and in openstack nova[0].
I understand that this was not supported by the kernel or qemu when discard support was initially implemented[1] in openstack, but times have changed.
Virtio-blk in upstream Kernel[2] and in qemu[3] does clearly support discard/trim, which we discovered thanks to StackExchange[4].
So my question is, has someone successfully used trim/discard with virtio-blk in nova provisioned vms?

That is a good question. If it works in upstream qemu, it should work with a nova-provisioned vm unless there is something we explicitly need to add in the xml to make it work. I suspect it will just work and we [...]
[1] makes me guess that there is no code path that works with virtio-blk, but I'm not sure I understood all the code paths in nova yet. We are still working our way through it, though.
virtio-blk is our default storage backend for libvirt. If you don't specify otherwise, it is using virtio-blk. To request it explicitly, you set hw_disk_bus=virtio; virtio in the image properties maps to virtio-blk. If you want virtio-scsi, you have to request that explicitly.
We currently use the train release in conjunction with ceph rbd volumes for testing this feature but weren't yet able to use it successfully. Looking at nova's master branch not much seems to have changed regarding trim support.
In the past, to use trim you had to configure virtio-scsi, so that was the supported way to enable this when train was released. Looking at libvirt, we might need to pass some additional optional driver argument to make it work:

https://libvirt.org/formatdomain.html

The optional discard attribute controls whether discard requests (also known as "trim" or "unmap") are ignored or passed to the filesystem. The value can be either "unmap" (allow the discard request to be passed) or "ignore" (ignore the discard request). Since 1.0.6 (QEMU and KVM only)

The optional detect_zeroes attribute controls whether to detect zero write requests. The value can be "off", "on" or "unmap". First two values turn the detection off and on, respectively. The third value ("unmap") turns the detection on and additionally tries to discard such areas from the image based on the value of discard above (it will act as "on" if discard is set to "ignore"). NB enabling the detection is a compute intensive operation, but can save file space and/or time on slow media. Since 2.0.0

Although it's not clear that libvirt was updated to support the virtio-blk support added in qemu 4.0.0, since the libvirt docs were not updated to reference that.
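As a rough illustration of where the discard attribute quoted above sits in the domain XML, here is a minimal Python sketch that builds a <disk> element with discard set on its <driver> child. The function name, the file path, the qcow2 driver type and the vda target are assumptions made for this example only; this is not how nova generates its XML, and an rbd-backed volume would use a network disk source rather than a file.

```python
import xml.etree.ElementTree as ET


def build_disk_xml(source_file: str, target_dev: str = "vda",
                   discard: str = "unmap") -> str:
    """Return a libvirt-style <disk> element for a virtio-blk disk.

    Hypothetical helper for illustration; not part of nova or libvirt.
    """
    # The libvirt docs quoted above list "unmap" and "ignore" as the
    # accepted values of the discard attribute.
    if discard not in ("unmap", "ignore"):
        raise ValueError("discard must be 'unmap' or 'ignore'")

    disk = ET.Element("disk", type="file", device="disk")
    # discard is an attribute of the <driver> sub-element.
    ET.SubElement(disk, "driver", name="qemu", type="qcow2", discard=discard)
    ET.SubElement(disk, "source", file=source_file)
    # bus="virtio" selects virtio-blk (as opposed to bus="scsi").
    ET.SubElement(disk, "target", dev=target_dev, bus="virtio")
    return ET.tostring(disk, encoding="unicode")


if __name__ == "__main__":
    # Assumed example path.
    print(build_disk_xml("/var/lib/libvirt/images/guest.qcow2"))
```

Comparing the <driver> element of a running guest (for example from a dump of its domain XML) against this shape is one way to see whether the option actually reached libvirt.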
If this is not a supported configuration currently, should I submit a blueprint to enable this feature?
Yes, although I think we need to confirm whether this is supported in libvirt.

Kashyap, perhaps you could ask our virt folk internally and find out if we need to do anything to enable trim support for virtio-blk?

We do have support for setting the driver discard option
https://github.com/openstack/nova/blob/50fdbc752a9ca9c31488140ef2997ed59d861...
and there is a nova config option to enable it:
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.hw_...

So perhaps you just need to set that to unmap, e.g. in /etc/nova/nova.conf:

[libvirt]
hw_disk_discard=unmap

It is used in the images backend
https://github.com/openstack/nova/blob/50fdbc752a9ca9c31488140ef2997ed59d861...
and we also seem to have support in the volume driver
https://github.com/openstack/nova/blob/50fdbc752a9ca9c31488140ef2997ed59d861...

So I suspect you are just missing setting the config option to unmap, and the debug log in
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L2...
is probably outdated and can be removed.

If you can validate that the config option works, can you open a bug for the outdated message?
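To check quickly whether a compute node's configuration carries the setting suggested above, something like the following Python sketch could be used. It parses an INI fragment with the standard-library configparser, which is an assumption of convenience: nova itself reads its configuration through oslo.config, and the function name here is hypothetical.

```python
import configparser
from typing import Optional

# The fragment suggested in the message above.
NOVA_CONF_FRAGMENT = """
[libvirt]
hw_disk_discard = unmap
"""


def discard_mode(conf_text: str) -> Optional[str]:
    """Return the [libvirt] hw_disk_discard value, or None if it is unset.

    Hypothetical helper; nova does not parse its config this way.
    """
    # interpolation=None keeps '%' characters in other options literal;
    # strict=False tolerates duplicate sections or options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.get("libvirt", "hw_disk_discard", fallback=None)


if __name__ == "__main__":
    print(discard_mode(NOVA_CONF_FRAGMENT))
```

On a real host the text would come from reading /etc/nova/nova.conf; a result of None means the option is not set in that file.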
Would there be other devs/ops interested in this?
Any help or pointers would be appreciated.
On Thu, Sep 2, 2021 at 1:47 PM Sean Mooney <smooney@redhat.com> wrote:
On Thu, 2021-09-02 at 16:48 +0000, Sven Kieske wrote:
Virtio-blk in upstream Kernel[2] and in qemu[3] does clearly support discard/trim, which we discovered thanks to StackExchange[4].
So my question is, has someone successfully used trim/discard with virtio-blk in nova provisioned vms?
and there is a nova config option to enable it.
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.hw_... so perhaps you just need to set that to unmap, e.g. in /etc/nova/nova.conf:
[libvirt]
hw_disk_discard=unmap
With a recent enough qemu on the hypervisor, and discard support on the block store, it can technically work. However, the guest was the problem for me. For example, I believe the RHEL/CentOS 8.x / 4.18 kernel does NOT provide support for this capability. But, Linux 5.4 works fine.

Other than this, it works fine. I have used it for a while now, and it is just bringing the guests up-to-date that prevented it from being more useful. For older guests, I have a simple script that basically fills the guest's file system up to 95% full with zero blocks, and the zero blocks have the same effect on my underlying storage.

Finally, I would caveat that discard should not be a storage management solution. The storage should be the right size, and it should get naturally recycled. But over time, perhaps every few months or once a year, there is a consequence: guests don't pass "free" information through to the hypervisor on the regular, so dead blocks that never get garbage collected accumulate in the underlying storage, and this is an unfortunate waste. In our case, SolidFire stores the blocks 3X and it adds up. In five years, I've done the clean-up with discard or zero only twice.

--
Mark Mielke <mark.mielke@gmail.com>
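The zero-fill script mentioned above was not posted to the list. Purely as an illustration of the idea, a Python sketch could look like the one below. Every name in it is hypothetical, the 95% target comes from the message, and it only helps when the underlying storage reclaims zeroed blocks, as the author reports for his setup.

```python
import os
import shutil


def bytes_to_fill(total: int, used: int, target_percent: int = 95) -> int:
    """Bytes of zeros needed to bring usage up to target_percent of total."""
    return max(0, total * target_percent // 100 - used)


def zero_fill(mount_point: str, target_percent: int = 95,
              chunk_size: int = 1024 * 1024) -> int:
    """Write zeros into a temporary file until the file system reaches
    target_percent usage, then delete the file. Returns bytes written.

    Hypothetical sketch; run it only on a guest you can afford to fill.
    """
    usage = shutil.disk_usage(mount_point)
    # Never try to write more than the file system reports as free.
    remaining = min(
        bytes_to_fill(usage.total, usage.used, target_percent), usage.free
    )
    path = os.path.join(mount_point, "zero.fill")  # assumed file name
    zeros = bytes(chunk_size)  # a buffer of chunk_size zero bytes
    written = 0
    try:
        with open(path, "wb") as handle:
            while written < remaining:
                count = min(chunk_size, remaining - written)
                handle.write(zeros[:count])
                written += count
            # Push the zeros down to the block device before deleting.
            handle.flush()
            os.fsync(handle.fileno())
    finally:
        if os.path.exists(path):
            os.remove(path)
    return written
```

Stopping short of 100% leaves headroom for the guest's own writes while the fill is in progress.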
On Thu, 2021-09-02 at 18:44 +0100, Sean Mooney wrote:
although it's not clear that libvirt was updated to support the virtio-blk support added in qemu 4.0.0, since the libvirt docs were not updated to reference that.
Thanks for all the feedback and pointers! FWIW we were already aware of the discard=unmap option in libvirt, I should have mentioned that.

In the meantime I found this bugreport: https://bugzilla.redhat.com/show_bug.cgi?id=1672682
which seems to indicate that this indeed works in libvirt for virtio-blk.

So I will continue to figure out why this doesn't work as intended in our setup.

--
Regards
Sven Kieske
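One check that may help when narrowing this down is whether the guest kernel sees the disk as discard-capable at all. On Linux the block layer exposes this in sysfs as queue/discard_max_bytes, where 0 means the device does not support discard. The Python sketch below reads that attribute; the helper names and the vda device name are assumptions for the example.

```python
from pathlib import Path


def discard_max_bytes(device: str, sysfs_root: str = "/sys/block") -> int:
    """Return queue/discard_max_bytes for a block device, or 0 if the
    attribute is missing.

    sysfs_root is a parameter only so the function can be pointed at a
    test directory; on a real guest the default applies.
    """
    path = Path(sysfs_root) / device / "queue" / "discard_max_bytes"
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        return 0


def supports_discard(device: str, sysfs_root: str = "/sys/block") -> bool:
    """True if the kernel reports a non-zero discard limit for device."""
    return discard_max_bytes(device, sysfs_root) > 0


if __name__ == "__main__":
    # "vda" is the usual name of the first virtio-blk disk; adjust as needed.
    print("vda supports discard:", supports_discard("vda"))
```

A zero here points at the guest kernel or the device configuration on the hypervisor, rather than at the storage backend.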
On Fri, Sep 03, 2021 at 08:15:51AM +0000, Sven Kieske wrote: [...]
Thanks for all the feedback and pointers! FWIW we were already aware of the discard=unmap option in libvirt, I should have mentioned that.
In the meantime I found this bugreport: https://bugzilla.redhat.com/show_bug.cgi?id=1672682
which seems to indicate that this indeed works in libvirt for virtio-blk.
So I will continue to figure out why this doesn't work as intended in our setup.
Yes, 'discard' support for 'virtio-blk' was added to Linux and QEMU in the following versions:

  - Linux: v5.0 onwards
  - QEMU: v4.0.0

So make sure you have those versions at a minimum. And as you've discovered in the above Red Hat bugzilla, libvirt already has the config option to enable 'discard'; and Nova has the config option too.

* * *

Some notes on 'discard' (also called 'trim') that I learnt from my colleague Dan Berrangé:

  - To genuinely save storage space, you need to enable 'trim' at every single layer of the I/O stack -- guest, host, and storage server(s).

  - Even if the host storage doesn't support 'discard', it might still be useful to enable it for the guests -- it'll keep your qcow2 file size down by releasing unused clusters, so if you need to copy the qcow2 file to another host there will be less data needing copying.

  - If you're "thick-provisioning" your guests so that they will never trigger ENOSPC, then you don't want to enable 'discard'.

[...]

--
/kashyap
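The version minimums given above can be turned into a simple check, sketched here in Python. The constants mirror the versions named in the message; the function names are hypothetical, and the version strings in the usage lines are made-up examples. Treat the result as a heuristic only, since distribution kernels may carry or lack features independently of their upstream base version.

```python
import re
from typing import Tuple

# Minimum upstream versions named in the message above.
MIN_LINUX = (5, 0)
MIN_QEMU = (4, 0, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading dotted-numeric part of a version string,
    ignoring any distribution suffix that follows it."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def _pad(version: Tuple[int, ...], width: int) -> Tuple[int, ...]:
    """Right-pad a version tuple with zeros to the given width."""
    return version + (0,) * (width - len(version))


def meets_minimum(version: str, minimum: Tuple[int, ...]) -> bool:
    """True if version is at least minimum, comparing numerically."""
    parsed = parse_version(version)
    width = max(len(parsed), len(minimum))
    return _pad(parsed, width) >= _pad(minimum, width)


if __name__ == "__main__":
    # Example strings only; substitute the output of `uname -r` in the
    # guest and the qemu version reported on the hypervisor.
    print(meets_minimum("5.4.0-81-generic", MIN_LINUX))
    print(meets_minimum("4.2.1", MIN_QEMU))
```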
participants (4)
- Kashyap Chamarthy
- Mark Mielke
- Sean Mooney
- Sven Kieske