Interesting, I have a kolla-ansible one-node cluster with Antelope, and there I see what you describe as well. So the behavior did indeed change. I guess the docs should be updated to include a read-only rbd profile for glance.

Quoting Eugen Block <eblock@nde.ag>:
Hi,
you're right about the permission error when running 'rbd children' as the glance user. However, this hasn't come up in our cloud environments yet, because glance first checks for existing snapshots, which usually exist and are protected:
2023-12-14 07:54:59.493 2895 WARNING glance_store._drivers.rbd [req-f57f0688-aff9-46b8-90d6-f385438cfb8f aa7c830654a64ce2a0a511a5959c5ca1 17d254c1c283409c94559d588e17b703 - default default] Remove image 3b04672f-5e27-447c-965f-878a4c8b1aa9 failed. It has snapshot(s) left.: rbd.ImageHasSnapshots: [errno 39] RBD image has snapshots (error removing image)
The image cannot be deleted because it has snapshot(s).: 409 Conflict
Failed to delete 1 of 1 images.
Has this behavior changed in newer releases? We still run Victoria. If it has changed, read-only permissions for glance sound reasonable, I guess.

Quoting Christian Rohmann <christian.rohmann@inovex.de>:
Hey openstack-discuss,
I am a little confused about correct and required Ceph authx permissions for the RBD clients in Cinder, Glance and also Nova:
When Glance is requested to delete an image, it checks whether this image has dependent children, see https://opendev.org/openstack/glance_store/src/commit/6f5011d1f05c99894fb8b9.... The children of Glance images are usually (Cinder) volumes, which therefore live in a different RBD pool, "volumes". But if such children do exist, the Glance API throws a 500 error. There is also a bug about this issue on Launchpad [3].
Manually using the RBD client shows the same error:
# rbd -n client.glance -k /etc/ceph/ceph.client.glance.keyring -p images children $IMAGE_ID
2023-12-13T16:51:48.131+0000 7f198cf4e640 -1 librbd::image::OpenRequest: failed to retrieve name: (1) Operation not permitted
2023-12-13T16:51:48.131+0000 7f198d74f640 -1 librbd::ImageState: 0x5639fdd5af60 failed to open image: (1) Operation not permitted
rbd: listing children failed: (1) Operation not permitted
2023-12-13T16:51:48.131+0000 7f1990c474c0 -1 librbd::api::Image: list_descendants: failed to open descendant b7078ed7ace50d from pool instances:(1) Operation not permitted
So it's a permission error. Neither the Glance documentation [1] nor the Ceph documentation [2] on configuring the ceph auth caps mentions granting Glance anything on the volumes pool. So this is what I currently have configured:
client.cinder
    key: REDACTED
    caps: [mgr] profile rbd pool=volumes, profile rbd-read-only pool=images
    caps: [mon] profile rbd
    caps: [osd] profile rbd pool=volumes, profile rbd-read-only pool=images
client.glance
    key: REDACTED
    caps: [mgr] profile rbd pool=images
    caps: [mon] profile rbd
    caps: [osd] profile rbd pool=images
client.nova
    key: REDACTED
    caps: [mgr] profile rbd pool=instances, profile rbd pool=images
    caps: [mon] profile rbd
    caps: [osd] profile rbd pool=instances, profile rbd pool=images
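To make the gap in these caps visible, here is a minimal sketch that checks which pools an osd caps string names. The function caps_cover_pool is a hypothetical helper made up for this illustration, not part of Ceph. It only matches "pool=NAME" clauses as plain text and ignores everything else (for example, a profile given without any pool restriction), so treat it as a rough reading aid rather than a real permission check.

```shell
#!/bin/sh
# Hypothetical helper (not a Ceph tool): succeed if the caps string in $1
# contains a "pool=$2" clause. Pure string matching, no cluster access.
caps_cover_pool() {
    case ", $1," in
        *"pool=$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# The osd caps of client.glance exactly as listed above.
glance_osd='profile rbd pool=images'

# Pool names taken from the caps listing above.
for pool in images volumes instances; do
    if caps_cover_pool "$glance_osd" "$pool"; then
        echo "client.glance: $pool is named in the osd caps"
    else
        echo "client.glance: $pool is NOT named in the osd caps"
    fi
done
```

With the caps as listed, only "images" is named for client.glance, which fits the "Operation not permitted" error when rbd tries to open a descendant in another pool.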
When granting the glance client e.g. "rbd-read-only" to the volumes pool via:
# ceph auth caps client.glance mon 'profile rbd' osd 'profile rbd pool=images, profile rbd-read-only pool=volumes' mgr 'profile rbd pool=images, profile rbd-read-only pool=volumes'
the error is gone.
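As a sketch only: the same caps string can be assembled for any set of pools that might hold children of Glance images. The function glance_caps_cmd is a hypothetical helper, and it prints the command instead of running it, so nothing changes on the cluster. Whether rbd-read-only on whole pools is really the least privilege needed is exactly the open question here, so this is an assumption and not a recommendation.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) a `ceph auth caps` command for
# client.glance. $1 is Glance's own pool; every further argument is a pool
# that gets the rbd-read-only profile (assumption: read-only is sufficient).
glance_caps_cmd() {
    own_pool="$1"
    shift
    caps="profile rbd pool=${own_pool}"
    for pool in "$@"; do
        caps="${caps}, profile rbd-read-only pool=${pool}"
    done
    printf "ceph auth caps client.glance mon 'profile rbd' osd '%s' mgr '%s'\n" "$caps" "$caps"
}

# Reproduces the command shown above.
glance_caps_cmd images volumes
```

Passing further pool names, e.g. `glance_caps_cmd images volumes instances`, would extend the read-only grant to them as well. The error output above names a descendant in pool "instances", so that may be relevant, but I have not verified it.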
I am wondering, though, whether this is really just a documentation bug (in OpenStack AND Ceph equally), and whether Glance really needs read-only on the whole volumes pool or there is some other capability that covers asking for child images.
All in all I am simply wondering what the correct and least-privilege ceph auth caps for the RBD clients in Cinder, Glance and Nova would look like.
Thanks
Christian
[1] https://docs.openstack.org/glance/latest/configuration/configuring.html#conf...
[2] https://docs.ceph.com/en/latest/rbd/rbd-openstack/#setup-ceph-client-authent...
[3] https://bugs.launchpad.net/glance/+bug/2045158