[cinder][nova][ceph] Cinder encryption happens at QEMU layer, not librbd — significant overhead, seeking direction
Hi all,

We're running OpenStack with Ceph RBD as the Cinder storage backend and need encryption at rest with per-tenant key isolation.

We've evaluated two approaches:

1. Cinder volume encryption

Per-volume keys stored in OpenBao via Castellan. Cinder formats the RBD image with a LUKS header and QEMU handles decryption at attach time. The problem is that encryption always happens at the QEMU layer: Nova's libvirt driver doesn't set the engine to librbd, so encryption/decryption never happens inside librbd where it would be most efficient. It stays at the QEMU block layer, adding significant overhead. We're seeing 60-80% IOPS reduction in our testing.

2. Ceph librbd-native LUKS encryption

Encryption happens inside librbd itself: client-side, per-image keys, data encrypted before it leaves the compute node. This is architecturally the right place for Ceph to do encryption. But there's no integration with external key managers. rbd_encryption_load() requires the passphrase directly from the caller, making it impractical for multi-tenant environments that need centralized key management.

The gap:

- Cinder encryption: per-tenant keys ✓, KMS integration ✓, performance ✗ (encryption at QEMU layer, not librbd)
- librbd-native: performance ✓, per-image keys ✓, KMS integration ✗

What's the community's recommended approach for Ceph + OpenStack deployments that need per-tenant encryption keys with acceptable performance? Are we missing something, or is this a known gap?

Happy to contribute if there's a path forward.

Thanks,
Ravi Sasi Tilak
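To make the KMS gap in approach 2 concrete, here is a minimal Python sketch of the glue a caller would have to supply today: fetch the per-volume passphrase from a key manager, then hand it to the image layer. KeyStore, EncryptableImage, open_encrypted and the "luks2" format string are hypothetical stand-ins for illustration, not Castellan's or librbd's real interfaces; only the shape (the caller supplies the passphrase) comes from the description above.

```python
from __future__ import annotations

from typing import Optional, Protocol, Tuple


class KeyStore(Protocol):
    """Hypothetical key-manager client (stand-in for a Castellan-backed store)."""

    def get_passphrase(self, key_id: str) -> bytes: ...


class EncryptableImage(Protocol):
    """Hypothetical image handle; load_encryption stands in for rbd_encryption_load()."""

    def load_encryption(self, fmt: str, passphrase: bytes) -> None: ...


def open_encrypted(image: EncryptableImage, keys: KeyStore, key_id: str,
                   fmt: str = "luks2") -> None:
    """Fetch the per-volume passphrase and pass it to the image layer."""
    passphrase = keys.get_passphrase(key_id)
    if not passphrase:
        raise ValueError(f"empty passphrase for key {key_id!r}")
    image.load_encryption(fmt, passphrase)


# In-memory fakes so the sketch runs without Ceph or a key manager.
class DictKeyStore:
    def __init__(self, keys: dict) -> None:
        self._keys = keys

    def get_passphrase(self, key_id: str) -> bytes:
        return self._keys[key_id]


class FakeImage:
    def __init__(self) -> None:
        self.loaded: Optional[Tuple[str, bytes]] = None

    def load_encryption(self, fmt: str, passphrase: bytes) -> None:
        # Record what the caller handed over instead of touching a real image.
        self.loaded = (fmt, passphrase)


if __name__ == "__main__":
    image = FakeImage()
    open_encrypted(image, DictKeyStore({"vol-1": b"example"}), "vol-1")
    print(image.loaded)
```

The point of the sketch is only that something on the compute host has to sit between the key manager and the image open call; where that glue should live in OpenStack is the open question.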
Apologies for top posting, but I just want to say that this would be a good topic for the PTG next week; you can propose something on the cinder planning etherpad:
https://etherpad.opendev.org/p/cinder-hibiscus-ptg

We would be interested in some more information about what you are seeing and how you are measuring it. There was an issue similar to this back in roughly the ussuri timeframe that was dealt with by a nova workaround:
https://review.opendev.org/c/openstack/nova/+/708030
that was deprecated in wallaby:
https://review.opendev.org/c/openstack/nova/+/778004
and removed in xena:
https://review.opendev.org/c/openstack/nova/+/805647

I don't recall there being any objections to the removal; it seems like the performance issues in libgcrypt had been addressed and everyone was happy (or at least silent).

As I said earlier, we would be very interested in how you are benchmarking. You mention that doing the encryption in the librbd layer instead of the qemu layer would be more efficient, but it seems to me like it would be roughly the same since the encryption has to happen somewhere on the host.

And, of course, we are extremely interested in your offer to contribute!

cheers,
brian
On 15/04/2026 04:15, Brian Rosmaita wrote:
Apologies for top posting, but I just want to say that this would be a good topic for the PTG next week; you can propose something on the cinder planning etherpad: https://etherpad.opendev.org/p/cinder-hibiscus-ptg
We would be interested in some more information about what you are seeing and how you are measuring it. There was an issue similar to this back in roughly the ussuri timeframe that was dealt with by a nova workaround: https://review.opendev.org/c/openstack/nova/+/708030 that was deprecated in wallaby: https://review.opendev.org/c/openstack/nova/+/778004 and removed in xena: https://review.opendev.org/c/openstack/nova/+/805647
I don't recall there being any objections to the removal; it seems like the performance issues in libgcrypt had been addressed and everyone was happy (or at least silent).

There was an objection to using kernel RBD when it was introduced. It was added purely as a workaround, but when you used krbd you lost a bunch of functionality, like the ability to resize Cinder volumes connected to a live VM. There are some other operational issues, but I don't remember what those were off the top of my head. I think krbd is less likely to lock up the host kernel in the same way that hard-mounting an NFS volume can if there is a network outage, but I know that there were some limitations in terms of feature support between the two. That said, it's been many years now since we looked at this, so those issues may have all been addressed in modern implementations.

In general, using krbd is less secure because when the VM is stopped the encrypted volume is unencrypted on the compute node, since by design we do not detach the volume from the host when the guest is stopped, if I recall. Also, using cryptsetup and krbd means that while the VM is running the block device is decrypted on the host and could be mounted, say read-only, by a bad actor to view the content much more easily than if the decryption is only done in QEMU. So those were all reasons this was never officially supported as a feature in Nova and was considered only a workaround.
As I said earlier, we would be very interested in how you are benchmarking. You mention that doing the encryption in the librbd layer instead of the qemu layer would be more efficient, but it seems to me like it would be roughly the same since the encryption has to happen somewhere on the host.
The issue that folks sometimes still face after the upgrade to newer QEMU versions is that, for the performance to improve, you need a newer QEMU, a newer librados, and a CPU with the required feature flags. The kernel implementation worked on older hardware than the native implementation.
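A rough way to check the hardware side of this on a Linux compute node is to look at the CPU flags the kernel reports. The sketch below assumes the relevant feature flag is the AES instruction flag, shown as "aes" in /proc/cpuinfo; the message above does not name the flag, so that is an assumption of this sketch, and the function names are made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Collect CPU feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 lists them under "flags"; some other architectures use "Features".
        if sep and key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return flags


def has_aes_acceleration(cpuinfo_text):
    """True if the CPU advertises the 'aes' flag (assumed to be the one that matters)."""
    return "aes" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print("AES instructions advertised:", has_aes_acceleration(f.read()))
```

This only covers the CPU; the QEMU and Ceph library versions would need to be checked separately.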
And, of course, we are extremely interested in your offer to contribute!
If we can come up with a solution (or if the operational issues with krbd/gcrypt are addressed) we could re-evaluate it. os-brick still has support for the host mounting, I believe. So I could see perhaps a new attribute on the Cinder volume to opt into it, perhaps denoted via the volume type, so end users can choose between security (encryption at rest / decrypted data available at runtime) vs performance vs extensibility at runtime. We do eventually want to finish the work to have Nova provision storage with encryption, but I suspect that won't be for a release or two.
cheers, brian
On 4/11/26 8:54 AM, Ravi Sasi Tilak wrote:
Hi all,
We're running OpenStack with Ceph RBD as the Cinder storage backend and need encryption at rest with per-tenant key isolation.
We've evaluated two approaches:
1. Cinder volume encryption
Per-volume keys stored in OpenBao via Castellan. Cinder formats the RBD image with a LUKS header and QEMU handles decryption at attach time. The problem is that encryption always happens at the QEMU layer: Nova's libvirt driver doesn't set the engine to librbd, so encryption/decryption never happens inside librbd where it would be most efficient. It stays at the QEMU block layer, adding significant overhead. We're seeing 60-80% IOPS reduction in our testing.
This could be an interesting middle ground. If librbd provides better performance than QEMU without needing krbd, which has the operational issues, we might be able to add this as a new feature/option. Last time we looked at this there were only 3 options:

- native QEMU encryption + native QEMU RBD transport via librbd
- native QEMU encryption + host-mounted volume via krbd
- cryptsetup host-level decryption + krbd

From top to bottom the security profile gets worse and the performance goes up. If we can now do librbd-native encryption + librbd transport entirely in userspace with no host mounting, that would give you the security of what we have today with hopefully some or all of the performance of the latter options. That is worth exploring if you can provide pointers to how to configure this.
On 2026-04-11 18:24:26 (UTC+0530) Ravi Sasi Tilak <rsasitilak0987@gmail.com> wrote:
What's the community's recommended approach for Ceph + OpenStack deployments that need per-tenant encryption keys with acceptable performance? Are we missing something, or is this a known gap?

From my perspective, this is a known gap. If I'm understanding correctly, you're talking about librbd engine and RBD layered encryption support in libvirt [1]. If I'm not understanding correctly, please ignore the rest of my reply :)

I have actually implemented this before in a past attempt two years ago to add "local" disk encryption in Nova [2] and I am familiar with the librbd engine and layered encryption's ability to have a different key per image layer. Prior to layered encryption, a parent image and child image had to have the same key.

Obviously Nova local disk encryption never got off the ground last time we tried. The complexity is high and it was extremely difficult to get code review.

Anyway, I am generally interested in this and happy to help if there is some way I could be helpful.

-melwitt

[1] https://libvirt.org/formatstorageencryption.html
[2] https://review.opendev.org/c/openstack/nova/+/889912
Hi all,

I want to correct something from my original post: I said "60–80% overhead" but that figure came from a flawed benchmark; there were competing processes on the host and no CPU pinning. Here are results from a controlled run.

Test setup:

- Compute host: HPE server, 128 logical CPUs; CPUs 64–79 reserved as Nova cpu_dedicated_set
- VM: Ubuntu 24.04, 4 vCPU (pinned 1:1 to host CPUs 66/70/71/75), 4 GiB, flavor hw:cpu_policy=dedicated
- fio invoked as taskset -c 0-3 fio inside the VM to bind to the pinned set
- 3× 20 GiB volumes, freshly preconditioned: vdb plain, vdc qemu-LUKS, vdd librbd-LUKS (cold-attached)
- fio: --rw=read|write --bs=4k|16k|1m --iodepth=1|8|16|32 --runtime=30 --ramp_time=5 --ioengine=libaio --direct=1

Sequential reads:

  bs / QD      plain         qemu-LUKS             librbd-LUKS
  1M  QD=1     176 MiB/s     105 MiB/s  (-40%)     155 MiB/s  (-12%)
  1M  QD=8     349 MiB/s     199 MiB/s  (-43%)     347 MiB/s  (-0.5%)
  1M  QD=16    349 MiB/s     267 MiB/s  (-23%)     347 MiB/s  (-0.3%)
  16K QD=8     7.3k IOPS     5.8k IOPS  (-20%)     7.4k IOPS  (+1%)
  4K  QD=8     10.8k IOPS    9.4k IOPS  (-13%)     10.5k IOPS (-2.5%)
  4K  QD=16    19.1k IOPS    16.9k IOPS (-12%)     18.7k IOPS (-2.4%)

Sequential writes:

  bs / QD      plain         qemu-LUKS             librbd-LUKS
  1M  QD=1     62.9 MiB/s    47.4 MiB/s (-25%)     57.1 MiB/s (-9%)
  1M  QD=8     64.7 MiB/s    49.7 MiB/s (-23%)     59.5 MiB/s (-8%)
  1M  QD=16    64.5 MiB/s    49.9 MiB/s (-23%)     60.1 MiB/s (-7%)
  16K QD=8     631 IOPS      583 IOPS   (-8%)      642 IOPS   (+2%)
  4K  QD=8     714 IOPS      681 IOPS   (-5%)      710 IOPS   (-1%)
  4K  QD=16    722 IOPS      696 IOPS   (-4%)      690 IOPS   (-4%)

The overhead is real at large block sizes; the 1M range is where databases and storage-heavy workloads operate. The reason isn't that encryption has to happen somewhere (it does); it's about where the parallelism lives. QEMU decrypts/encrypts serially on a single thread; librbd does it per-object across multiple threads. At 4K the difference is noise; at 1M QD=8 the qemu path loses 43% on reads and 23% on writes, while librbd is within 1% on reads and 8% on writes.

On Sean's concern about krbd: the librbd path stays entirely in userspace (QEMU → librbd → messenger). No kernel RBD, no elevated privilege surface.

Melanie, your prior work on Nova local disk encryption [2] is directly relevant here. The placement of <encryption engine='librbd'> under <source> rather than <disk> is exactly the libvirt model you were working with.

We have two patches up for review:

- Cinder: https://review.opendev.org/986322
- Nova: https://review.opendev.org/986323

The Cinder patch adds rbd_encryption_engine as a volume-type extra spec (default qemu, preserving existing behaviour). The Nova patch reads that from connection info and places <encryption engine='librbd'> under <source> instead of <disk>.

Would appreciate eyes on both.

Thanks,
Ravi
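For readers who have not looked at the libvirt side, the difference between the two placements can be sketched with Python's standard xml.etree module. This is an illustrative sketch, not the code in either patch: the helper name build_disk, the pool/volume name, the all-zero secret UUID and the "luks" format string are placeholders assumed here, and only the rbd_encryption_engine extra spec, its qemu default, and the <source> vs <disk> placement come from the description above.

```python
import xml.etree.ElementTree as ET

VALID_ENGINES = ("qemu", "librbd")


def build_disk(extra_specs, rbd_name, secret_uuid, fmt="luks"):
    """Return a libvirt-style <disk> element for an encrypted RBD volume."""
    # Default to the existing behaviour when the extra spec is absent.
    engine = extra_specs.get("rbd_encryption_engine", "qemu")
    if engine not in VALID_ENGINES:
        raise ValueError(f"unsupported encryption engine: {engine!r}")

    disk = ET.Element("disk", type="network", device="disk")
    source = ET.SubElement(disk, "source", protocol="rbd", name=rbd_name)

    # librbd engine: <encryption> goes under <source>; qemu engine: under <disk>.
    parent = source if engine == "librbd" else disk
    enc = ET.SubElement(parent, "encryption", format=fmt)
    if engine == "librbd":
        enc.set("engine", "librbd")
    ET.SubElement(enc, "secret", type="passphrase", uuid=secret_uuid)
    return disk


if __name__ == "__main__":
    for specs in ({}, {"rbd_encryption_engine": "librbd"}):
        disk = build_disk(specs, "volumes/volume-example",
                          "00000000-0000-0000-0000-000000000000")
        ET.indent(disk)  # available in Python 3.9+
        print(ET.tostring(disk, encoding="unicode"))
```

Running it prints both variants side by side, which makes it easy to see that the only structural change is which element owns <encryption>.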
participants (5): Brian Rosmaita, melanie witt, Ravi Sasi Tilak, rsasitilak0987@gmail.com, Sean Mooney