[Cinder][Ceph] Wrong volume accounting in the Ceph cluster
Hello,

I'm experiencing an issue with the volume usage of a Ceph cluster that is currently used by OpenStack for volumes. I am running Ceph Octopus (15.2.17).

When I run 'ceph df', I get this output:

--- RAW STORAGE ---
CLASS  SIZE     AVAIL   USED    RAW USED  %RAW USED
hdd    120 TiB  33 TiB  87 TiB  87 TiB    72.72
TOTAL  120 TiB  33 TiB  87 TiB  87 TiB    72.72

--- POOLS ---
POOL                   ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics  1   32    69 MiB   101      137 MiB  0      11 TiB
images                 2   256   1.3 TiB  166.90k  2.5 TiB  10.72  11 TiB
vms                    3   64    574 KiB  21       2.5 MiB  0      11 TiB
volumes                4   2048  41 TiB   3.94M    82 TiB   79.42  11 TiB
backups                5   1024  407 GiB  111.85k  818 GiB  3.63   11 TiB

And when I run 'rbd du -p volumes', I get this total:

NAME     PROVISIONED  USED
<TOTAL>  14 TiB       8.3 TiB

So 'ceph df' reports 41 TiB stored (82 TiB raw) in the "volumes" pool, while 'rbd du' accounts for only 14 TiB provisioned and 8.3 TiB used across all images. The pool is set to replica 2, and mirroring is not enabled. I have checked for locked or leftover snapshots but found none. I also ran 'ceph osd pool deep-scrub volumes', but it didn't resolve the issue.

Has anyone encountered this problem before? Could someone provide assistance?
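For reference, this is roughly how I checked for leftover snapshots (a minimal sketch that just iterates over every image in the pool; I did the same with 'rbd lock ls' for locks):

# print any snapshots still attached to images in the "volumes" pool
for img in $(rbd ls -p volumes); do
    snaps=$(rbd snap ls "volumes/$img")
    [ -n "$snaps" ] && printf '== %s ==\n%s\n' "$img" "$snaps"
done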
Hi,

did you look at the trash?

rbd -p volumes trash ls
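If anything shows up there, something like this should show the details and let you clean it up (just a sketch):

rbd -p volumes trash ls --all --long   # also list images trashed by other sources, with details
rbd -p volumes trash purge             # remove expired trash entries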
Quoting lfsilva@binario.cloud:
[...]
Hello Eugen,

Yes, I ran the command but there is no output:

root@hc-node01:~# rbd -p volumes trash ls
root@hc-node01:~#
Did you bulk delete a lot of volumes? Are you sure there aren't any snapshots involved?

The math for the 'rbd du' output is plausible:

NAME     PROVISIONED  USED
<TOTAL>  14 TiB       8.3 TiB

3.94 million objects at the default 4 MiB object size come out to roughly 15 TiB, so it doesn't seem like there are orphaned objects.

Can you share 'ceph osd df', 'ceph -s' and 'ceph pg ls-by-pool volumes' in a text file?

BTW, replica size 2 is a really bad choice; it has been discussed many times why it should only be considered in test clusters or if your data is not important.
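The back-of-the-envelope calculation, assuming the default 4 MiB RBD object size:

python3 -c 'print(3.94e6 * 4 / 2**20, "TiB")'   # ~15.03 TiB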
Quoting lfsilva@binario.cloud:
[...]
Here are links to the files with the output of the requested commands:

https://drive.google.com/file/d/1GTXKlBDp6QTJWmNZxoycflqHeAPvadjy/view?usp=s...
https://drive.google.com/file/d/1TgtNW3btQSPrOI3vdNwLFpWGmyPYs0Ou/view?usp=s...
https://drive.google.com/file/d/1qG0Gro00rFEMvteUgG78sWMNZrakRL3_/view?usp=s...

Thank you very much, we will adjust to replica 3.
I don't have access to your Google Drive. Please use an accessible platform, make those files publicly available, or attach them as plain text files, although I'm not sure whether attachments are allowed here...

Quoting lfsilva@binario.cloud:
[...]
On 2024-10-29 17:23:19 +0000 (+0000), Eugen Block wrote:
[...] attach them as plain text files, although I'm not sure whether attachments are allowed here... [...]
Most reasonable kinds of attachments (including plain text files) are allowed, but if they push the message size over 40 KB the post will be held until a list moderator has a chance to look it over. I personally try to process the moderation holds for this list once a day unless I'm really busy with other things.
--
Jeremy Stanley
Hello Eugen,

Please try to access those links again; I've changed the permissions.
Okay, it's accessible now.

From a first glance (I won't have much time over the next few days) the numbers seem to match: one PG has a size of around 22 GB, and with 2048 PGs that comes out to roughly 45 TB (~41 TiB), which matches the STORED value in your 'ceph df' output. I don't exactly recall whether Octopus displayed usage information differently from newer releases, but I think it should match in general, especially with rbd.

There might be tombstones in the RocksDB of the OSDs, though; have you tried compaction? Offline compaction is usually more effective than online compaction, so you might want to stop the OSDs one by one and use ceph-kvstore-tool for that. But I don't expect it to free up that much space.

You could also try 'rbd sparsify' on a couple of images and see if you get some space back.
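A rough sketch of the offline compaction, assuming non-containerized OSDs with their data under /var/lib/ceph/osd (repeat per OSD id, one at a time, waiting for the cluster to become healthy in between):

ceph osd set noout                                       # avoid rebalancing while the OSD is down
systemctl stop ceph-osd@0
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact
systemctl start ceph-osd@0
ceph osd unset noout

# reclaim zeroed-out space from a single image ("<image>" is a placeholder):
rbd sparsify volumes/<image>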
Quoting lfsilva@binario.cloud:
[...]
participants (3)

- Eugen Block
- Jeremy Stanley
- lfsilva@binario.cloud