Hello,
For better performance, you can put the RocksDB/WAL on SSDs. By default, Ceph stores the RocksDB/WAL on the same HDD as the OSD data, but you can configure Ceph to place it on a separate SSD. A single SSD can be partitioned for multiple HDDs, with each partition holding the RocksDB/WAL of one HDD OSD, so you don't need one SSD per HDD.
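If you deploy with cephadm, one way to get this layout is an OSD service spec that sends data to rotational devices and the DB/WAL to non-rotational ones. A minimal sketch (the service_id and host pattern are placeholders; adjust to your hosts):

# write the spec to a file, then apply it with the orchestrator
cat > osd-spec.yaml <<'EOF'
service_type: osd
service_id: hdd-data-ssd-db     # placeholder name
placement:
  host_pattern: '*'             # all hosts; narrow this if needed
spec:
  data_devices:
    rotational: 1               # HDDs carry the object data
  db_devices:
    rotational: 0               # SSDs carry RocksDB/WAL (block.db)
EOF
ceph orch apply -i osd-spec.yaml

cephadm then carves the required block.db logical volumes out of the SSDs for you.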
On Tue, Sep 16, 2025 at 8:28 PM Eugen Block <eblock@nde.ag> wrote:
Hi,
I'd say this question is more suitable for the ceph-users mailing
list, but since Ceph is quite popular as an OpenStack storage back
end, you'll probably get helpful responses here as well. ;-)
More responses inline...
Quoting William Muriithi <wmuriithi@perasoinc.com>:
> Hello,
>
> We want to use Ceph as the OpenStack storage system, and we can't
> afford a purely SSD-based storage system. So we are planning to
> just put the metadata on SSD and leave the data on HDD.
>
> The documentation around this isn't very clear, and I wonder if someone
> can explain it a bit.
>
> Here is what the documentation says:
>
> https://docs.ceph.com/en/reef/start/hardware-recommendations
>
> DB/WAL (optional):
>   1x SSD partition per HDD OSD
>   4-5x HDD OSDs per DB/WAL SATA SSD
>   <= 10 HDD OSDs per DB/WAL NVMe SSD
> What does this mean? I am sorry to say, but it looks a tad ambiguous
> to me, though I suspect it's obvious once one has experience.
You should not put the DBs of more than 10 HDD OSDs on one SSD; a bit
lower is better, but it really depends on the actual workload etc. In
your case, with 6 OSDs in total per node, you can safely put all 6 DB
devices on one SSD.
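For a rough sense of scale (just arithmetic, not a sizing recommendation): one 800 GB SSD shared by 6 DBs gives about 800 GB / 6 ≈ 133 GB of block.db per OSD, i.e. roughly 1.6% of each 8 TB data device.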
> I have 6 8TB disks per system, and I want to use replication, so I will
> end up with 18 OSDs.
This sounds like you're planning to have 3 nodes in total (replication
size 3, good). But note that if a node is in maintenance and one
more node goes down, you'll have a service interruption, since monitor
quorum won't be possible until at least one more node comes back. Or
consider this: if you lose one entire node (hardware failure), your
PGs can't recover anywhere and stay degraded until a third node is
available again.
We've been running our own Ceph cluster for many years with this exact
setup, three nodes in total, and we never had any issues. But I just
want to raise awareness because many operators who are new to Ceph
don't really consider these possibilities.
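If you go with three nodes anyway, it is worth checking up front what the cluster will tolerate. For example (the pool name "volumes" is just an illustration):

ceph quorum_status -f json-pretty     # which mons currently form quorum
ceph osd pool get volumes size        # number of replicas, e.g. 3
ceph osd pool get volumes min_size    # I/O stops if fewer copies than this are up

With size 3 and min_size 2 you can lose one node and keep serving I/O, but a second failure stops writes until a node comes back.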
> I am hoping I don't need 18 SSDs, as we don't even have enough bays.
No, you definitely don't need that many SSDs.
> If we can add 2 800GB SSDs per server, how do we optimally map
> those 18 DB/WAL devices to a total of 6 SSD disks?
As I wrote above, one SSD per node should be sufficient to hold all 6 DBs.
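If you provision the OSDs manually with ceph-volume rather than cephadm, a batch call along these lines would do that on one node (device names are placeholders, and --report only prints the intended layout without creating anything):

ceph-volume lvm batch --report --bluestore \
    /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf \
    --db-devices /dev/sdg /dev/sdh
# drop --report to actually create the 6 OSDs; each SSD ends up with 3 block.db LVs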
>
> Regards,
> William