Hello,

For better performance, you can put the RocksDB/WAL on SSDs. By default, Ceph stores the RocksDB/WAL on the same HDD as the OSD data, but you can configure Ceph to place it on a separate SSD device. A single SSD can be partitioned for multiple HDDs, with each partition holding the RocksDB/WAL of one HDD OSD, so you don't need one SSD per HDD.
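
For example, when creating an OSD by hand with ceph-volume (just a sketch; /dev/sdb and /dev/nvme0n1p1 are placeholder device names):

   # the HDD holds the OSD data, the SSD partition holds block.db;
   # the WAL ends up on the same device as the DB unless you pass a
   # separate --block.wal device
   ceph-volume lvm create --data /dev/sdb --block.db /dev/nvme0n1p1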


On Tue, Sep 16, 2025 at 8:28 PM Eugen Block <eblock@nde.ag> wrote:
Hi,

I'd say this question is more suitable for the ceph-users mailing 
list, but since Ceph is quite popular as an OpenStack storage back 
end, you'll probably get helpful responses here as well. ;-)

More responses inline...

Quoting William Muriithi <wmuriithi@perasoinc.com>:

> Hello,
>
> We want to use Ceph as the OpenStack storage system, and we 
> can't afford a purely SSD-based storage system. So we are 
> planning to just set up the metadata on SSD and leave the data 
> on HDD.
>
> The documentation around this isn't very clear, and I wonder if 
> someone can explain it a bit.
>
> Here is what the documentation says:
>
> https://docs.ceph.com/en/reef/start/hardware-recommendations
>
> DB/WAL (optional)
>   1x SSD partition per HDD OSD
>   4-5x HDD OSDs per DB/WAL SATA SSD
>   <= 10 HDD OSDs per DB/WAL NVMe SSD
> What does this mean?  I am sorry to say, but it looks a tad 
> ambiguous to me, though I suspect it's obvious once one has 
> experience.

You should not put the DBs of more than 10 HDD OSDs on one SSD; a 
bit fewer is better, but it really depends on the actual workload 
etc. In your case, with 6 OSDs in total per node, you can safely 
put all 6 DBs on one SSD.
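
If you deploy the OSDs with ceph-volume directly, its batch mode 
can do that split for you. A rough sketch (the device names are 
just placeholders for your 6 HDDs and the SSD):

   # create 6 HDD OSDs and carve /dev/sdg into 6 block.db LVs,
   # one per OSD
   ceph-volume lvm batch /dev/sd{a..f} --db-devices /dev/sdg

With cephadm, an OSD service spec with data_devices/db_devices 
filters expresses the same mapping.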

> I have 6 8 TB disks per system, and I want to use replication, 
> so I will end up with 18 OSDs.

This sounds like you're planning to have 3 nodes in total (replication 
size 3, good). But note that in case a node is in maintenance and one 
more node goes down, you'll have a service interruption since monitor 
quorum won't be possible until at least one more node comes back. Or 
consider this: if you lose an entire node (hardware failure), your 
PGs can't recover anywhere and stay degraded until a third node is 
available again.

We've been running our own Ceph cluster for many years with this exact 
setup, three nodes in total, and we never had any issues. But I just 
want to raise awareness because many (new to Ceph) operators aren't 
really considering these possibilities.

> I am hoping I don't need 18 SSDs, as we don't even have enough bays.

No, you definitely don't need that many SSDs.

> If we can add two 800 GB SSDs per server, how do we optimally 
> map those 18 DB/WAL devices to a total of 6 SSD disks?

As I wrote above, one SSD per node should be sufficient for all 6 DBs.
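
For rough sizing (my arithmetic, not a figure from the docs): with 
two 800 GB SSDs per node you could put 3 DBs on each SSD, i.e. 
about 800 / 3 ≈ 266 GB per DB, or roughly 133 GB each if you use a 
single SSD for all 6. Both are within the often-cited rule of 
thumb of 1-4 % of the data device (80-320 GB for an 8 TB HDD). 
Splitting the DBs across both SSDs also limits the blast radius: 
if a DB SSD fails, every OSD whose DB lives on it fails with it.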

>
> Regards,
> William