[CINDER] - RBD backend reporting
Hey list.

Currently, when using an RBD backend with the default settings, total_capacity_gb is reported as MAX_AVAIL + USED bytes, converted into GiB. This seems a little odd to me, as I would expect total_capacity_gb to report the total size in GiB of the backend cluster. This can of course be fixed by adding "report_dynamic_total_capacity = false" to the cinder.conf section for the RBD backend.

This works fine for Ceph clusters where all pools consume from a single disk type/root, but in clusters with multiple roots or device types it does not work correctly.

Is this proving to be a pain point for anyone else, or is it just me? If it is a problem for others, I'm happy to write a patch. I'm thinking of something that gets the pool's crush rule and works out total_capacity_gb from the total capacity available to the pool under its root/crush rules.

Rgds
Steve.

The future has already arrived. It's just not evenly distributed yet - William Gibson
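For reference, the option belongs in the per-backend section of cinder.conf. A minimal sketch of such a section (the backend section name, pool and user below are only example values):

    [ceph]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    volume_backend_name = ceph
    rbd_pool = volumes
    rbd_user = cinder
    # Report the static pool size instead of MAX_AVAIL + USED.
    report_dynamic_total_capacity = false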
On 12/05, Steven Relf wrote:
> Hey list.
>
> Currently, when using an RBD backend with the default settings, total_capacity_gb is reported as MAX_AVAIL + USED bytes, converted into GiB. This seems a little odd to me, as I would expect total_capacity_gb to report the total size in GiB of the backend cluster. This can of course be fixed by adding "report_dynamic_total_capacity = false" to the cinder.conf section for the RBD backend.
Hi Steve,

That's the problem with having to keep backward compatibility: even if the driver is doing something non-standard and it's inconveniencing some users [1][2], we cannot just change the behavior, as it could cause trouble for another group of users who rely on the current behavior. That's why I had to set the default to true (keep the old behavior) in the fix.

[1]: https://bugs.launchpad.net/cinder/+bug/1712549
[2]: https://bugs.launchpad.net/cinder/+bug/1706057
> This works fine for Ceph clusters where all pools consume from a single disk type/root, but in clusters with multiple roots or device types it does not work correctly.
I don't currently have a system like that to check, but I would assume the current code works as intended: it gets the stats and quota for the pool and uses the more limiting value of the two. As far as I know, the stats should be returning the aggregate of the different disks that form the pool.

I would like to better understand the difference between what is being reported and what is expected in your environment. Could you share the output of the following commands?

$ ceph -f json-pretty df
$ ceph -f json-pretty osd pool get-quota <POOL-NAME>

Also, what values is Cinder reporting, and what values should it be reporting?

Thanks.
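To illustrate the stats-vs-quota calculation described above, here is a rough standalone sketch that shells out to those same two ceph commands and keeps the more limiting value. It is not the actual driver code; the pool name is an example and the JSON field names are taken from recent Ceph releases, so they may differ slightly on older clusters.

    #!/usr/bin/env python3
    # Rough sketch only - NOT the Cinder RBD driver code.
    # Combines pool stats from "ceph df" with the pool byte quota and
    # keeps the more limiting value of the two.
    import json
    import subprocess

    POOL = "volumes"  # example pool name


    def ceph(*args):
        """Run a ceph subcommand and return its parsed JSON output."""
        out = subprocess.check_output(("ceph", "-f", "json") + args)
        return json.loads(out)


    df = ceph("df")
    pool_stats = next(p["stats"] for p in df["pools"] if p["name"] == POOL)
    quota = ceph("osd", "pool", "get-quota", POOL)

    # Dynamic view (the default, report_dynamic_total_capacity = true):
    # total is what the pool has used plus what it could still grow by.
    total_bytes = pool_stats["bytes_used"] + pool_stats["max_avail"]
    free_bytes = pool_stats["max_avail"]

    # A byte quota on the pool, if set, may be the more limiting value.
    quota_bytes = quota.get("quota_max_bytes", 0)
    if quota_bytes:
        total_bytes = min(total_bytes, quota_bytes)
        free_bytes = min(free_bytes,
                         max(quota_bytes - pool_stats["bytes_used"], 0))

    GiB = 1024 ** 3
    print("total_capacity_gb ~= %.2f" % (total_bytes / GiB))
    print("free_capacity_gb  ~= %.2f" % (free_bytes / GiB))

Comparing the numbers this prints against what Cinder reports for the backend should show where the two views diverge.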
> Is this proving to be a pain point for anyone else, or is it just me? If it is a problem for others, I'm happy to write a patch.
I haven't heard of anyone having that problem before, though that doesn't mean there aren't people suffering from it as well.

Cheers,
Gorka.
> I'm thinking of something that gets the pool's crush rule and works out total_capacity_gb from the total capacity available to the pool under its root/crush rules.
>
> Rgds
> Steve.
> The future has already arrived. It's just not evenly distributed yet - William Gibson
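For anyone curious, below is a very rough sketch of the crush-rule approach described in the original mail, driving the ceph CLI from a standalone script rather than the driver's rados connection. It assumes a simple rule with a single "take" step, ignores replication/erasure-coding overhead, does not handle device-class rules (whose shadow roots such as "default~ssd" do not appear in "ceph osd df tree"), and the pool name and JSON field names are only illustrative of recent Ceph releases.

    #!/usr/bin/env python3
    # Very rough sketch: pool -> crush rule -> root bucket -> raw capacity
    # of the OSDs under that root. Illustrative only, see caveats above.
    import json
    import subprocess

    POOL = "volumes"  # example pool name


    def ceph(*args):
        """Run a ceph subcommand and return its parsed JSON output."""
        out = subprocess.check_output(("ceph", "-f", "json") + args)
        return json.loads(out)


    # Pool -> crush rule name.
    rule_name = ceph("osd", "pool", "get", POOL, "crush_rule")["crush_rule"]

    # Crush rule -> root bucket name (first "take" step).
    rule = next(r for r in ceph("osd", "crush", "rule", "dump")
                if r["rule_name"] == rule_name)
    root_name = next(s["item_name"] for s in rule["steps"] if s["op"] == "take")

    # Walk the crush tree below that root and sum the raw OSD capacity.
    tree = ceph("osd", "df", "tree")
    nodes = {n["id"]: n for n in tree["nodes"]}
    root = next(n for n in tree["nodes"]
                if n["name"] == root_name and n["type"] != "osd")


    def osds_under(node):
        if node["type"] == "osd":
            yield node
        for child_id in node.get("children", []):
            yield from osds_under(nodes[child_id])


    total_kb = sum(o["kb"] for o in osds_under(root))
    print("raw capacity under %r: %.2f GiB" % (root_name, total_kb / 1024.0 ** 2))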