And one more comment from our own experience: we tried running VMs on HDD-backed OSDs with their DBs on SSD, but it was too slow. So we configured a cache tier (an SSD-only pool) in front of our main pool, and that worked quite well for years. But cache tiering is deprecated, so we had to get rid of it, which we did a couple of months back, before we upgraded to Ceph Reef. During that process we moved almost all of our data pools to SSDs. So the main question is: can you be sure that HDD OSDs with their DBs on SSD will meet your performance requirements? One of the best things about Ceph is how flexible it is, you can reshape it at any time. So if you start with this HDD + SSD mix and it works for you, great! If it turns out to be too slow, you can reconfigure it according to your needs.

Quoting Eugen Block <eblock@nde.ag>:
Hi,
I'd say this question is more suitable for the ceph-users mailing list, but since Ceph is quite popular as an OpenStack storage back end, you'll probably get helpful responses here as well. ;-)
More responses inline...
Quoting William Muriithi <wmuriithi@perasoinc.com>:
Hello,
We want to use Ceph as the OpenStack storage system, but we can't afford a purely SSD-based storage system. So we are planning to just put the metadata on SSD and leave the data on HDD.
The documentation around this isn't very clear, and I wonder if someone can explain it a bit.
Here is what the documentation says:
https://docs.ceph.com/en/reef/start/hardware-recommendations
DB/WAL (optional):
- 1x SSD partition per HDD OSD
- 4-5x HDD OSDs per DB/WAL SATA SSD
- <= 10 HDD OSDs per DB/WAL NVMe SSD

What does this mean? I am sorry to say, but it looks a tad ambiguous to me, though I suspect it's obvious once one has experience.
You should not put the DBs of more than 10 HDD OSDs on one SSD; fewer is better, but it really depends on the actual workload etc. In your case, with 6 OSDs in total per node, you can safely put all 6 DB devices on one SSD.
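For example, if you deploy the OSDs manually with ceph-volume, that mapping can be expressed in a single batch command, which slices the SSD into one DB LV per HDD. This is only a sketch, the device names are made up and have to match your actual hardware:

# six rotational data devices, one SSD that is split into six DB volumes
ceph-volume lvm batch --bluestore \
    /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf \
    --db-devices /dev/nvme0n1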
I have 6 x 8 TB disks per system, and I want to use replication, so I will end up with 18 OSDs.
This sounds like you're planning to have 3 nodes in total (replication size 3, good). But note that if one node is in maintenance and another node goes down, you'll have a service interruption, since monitor quorum won't be possible until at least one more node comes back. Or consider this: if you lose one entire node (hardware failure), your PGs can't recover anywhere and stay degraded until a third node is available again.
We've been running our own Ceph cluster for many years with this exact setup, three nodes in total, and we never had any issues. But I just want to raise awareness, because many operators who are new to Ceph don't really consider these possibilities.
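As a side note, you can always check how many monitors are currently in quorum with the standard status commands, for example:

ceph mon stat                              # one-line summary of the monitor quorum
ceph quorum_status --format json-pretty    # full quorum details and the monmap

With three monitors, two of them have to be up to keep the quorum.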
I am hoping I don't need 18 SSDs, as we don't even have enough bays.
No, you definitely don't need that many SSDs.
If we can add 2 x 800 GB SSDs per server, how do we optimally map those 18 DB/WAL devices to a total of 6 SSDs?
As I wrote above, one SSD per node should be sufficient to put 6 DBs on it.
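If you deploy with cephadm, you can also let the orchestrator do that mapping for you with an OSD service spec that filters by device type. A rough sketch, assuming the hosts contain only the 6 HDDs plus the DB SSD(s); the service name and host pattern are placeholders:

cat > osd_spec.yaml <<EOF
service_type: osd
service_id: hdd-data-ssd-db
placement:
  host_pattern: '*'      # apply to all OSD hosts
spec:
  data_devices:
    rotational: 1        # HDDs become data devices
  db_devices:
    rotational: 0        # SSDs hold the DB/WAL
EOF
ceph orch apply -i osd_spec.yaml --dry-run   # preview, then run again without --dry-run

cephadm then creates one DB slice per HDD OSD on the SSD; the WAL lives on the same device unless you specify wal_devices separately.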
Regards, William