[barbican] database is growing and can not be purged
Hi all,

we are observing the following behavior in Barbican:

- the OpenStack environment uses both encrypted Cinder volumes and encrypted local storage (LVM) for Nova instances
- over time, the secrets and orders tables keep growing
- many soft-deleted entries in the secrets table cannot be purged by the DB cleanup script

As I understand it, both Cinder and Nova create secrets in Barbican on behalf of the user when creating an encrypted volume or booting an instance with encrypted local storage. They both do it via the castellan library, which under the hood creates orders in Barbican, waits for them to become active and returns to the caller only the ID of the generated secret. When the time comes to delete the thing (volume or instance), Cinder/Nova again use castellan, but only delete the secret, not the order (they are not aware that any 'order' was created in the first place). As a result, the orders are left in place, and the DB cleanup procedure does not delete soft-deleted secrets when there is an ACTIVE order referencing such a secret.

This is troublesome on many levels. Users who use Cinder or Nova may not even be aware that they are creating something in Barbican. Orders accumulating like that may eventually result in cryptic errors, e.g. when you run out of quota for orders. What's more, the default Barbican policies allow 'normal' (creator) users to create an order but not to delete it (only a project admin can do that), so even if the users are aware of Barbican's involvement, they cannot delete those orders manually anyway. Plus there is no good way in the API to determine outright which orders reference deleted secrets.

I see several ways of dealing with that and would like to ask for your opinion on which would be the best one:

1. Amend the Barbican API to allow filtering orders by secret; when castellan deletes a secret, search for the corresponding order and delete it as well; change the default policy to actually allow order deletion by the same users who can create them.
2. Cascade-delete orders when deleting secrets. This is easy, but probably violates that very policy that disallows normal users from deleting orders.
3. Improve the database cleanup so that it first marks any order that references a deleted secret as deleted too, so that later, when the time comes, both can be purged (or something like that). This has a similar downside to the previous option in that it is not explicit enough.

I've filed a bug for that, https://storyboard.openstack.org/#!/story/2010625, and proposed a patch for option 2 (cascade delete), but would like to ask what you would see as the most appropriate way, or maybe there is something else that I've missed.

Btw, the problem is probably even more pronounced with keypairs: when castellan is used to create those, under the hood both an order and a container are created besides the actual secrets, and again only the secret IDs are returned to the caller. When the time comes to delete things, the caller only knows about the secret IDs and can only delete them, leaving both the container and the order behind. Luckily, I did not find any place across OpenStack that actually creates keypairs using castellan... but the problem is definitely there.

Best regards,
-- Dr. Pavlo Shchelokovskyy Principal Software Engineer Mirantis Inc www.mirantis.com
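To make the purge behavior described in this message concrete, here is a minimal Python sketch. It is a toy in-memory model, not Barbican's actual schema or cleanup code: the `Secret` and `Order` classes and the functions `purgeable_secrets` and `soft_delete_orphaned_orders` are names assumed purely for illustration. It shows a soft-deleted secret staying pinned by a live order, and how something like option 3 (soft-deleting such orders first) would release it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Secret:
    id: str
    deleted: bool = False  # soft-delete flag (hypothetical model)


@dataclass
class Order:
    id: str
    secret_id: Optional[str]
    status: str = "ACTIVE"
    deleted: bool = False


def purgeable_secrets(secrets: Dict[str, Secret], orders: List[Order]) -> List[str]:
    """Return soft-deleted secrets that no live order still references."""
    pinned = {o.secret_id for o in orders if not o.deleted}
    return sorted(s.id for s in secrets.values() if s.deleted and s.id not in pinned)


def soft_delete_orphaned_orders(secrets: Dict[str, Secret], orders: List[Order]) -> List[str]:
    """Option 3 as a sketch: mark orders whose secret is already soft-deleted."""
    marked = []
    for order in orders:
        secret = secrets.get(order.secret_id)
        if not order.deleted and secret is not None and secret.deleted:
            order.deleted = True
            marked.append(order.id)
    return marked


def demo():
    # s1 was deleted by Cinder/Nova via castellan; its order o1 was left behind.
    secrets = {"s1": Secret("s1", deleted=True), "s2": Secret("s2")}
    orders = [Order("o1", "s1"), Order("o2", "s2")]
    before = purgeable_secrets(secrets, orders)
    marked = soft_delete_orphaned_orders(secrets, orders)
    after = purgeable_secrets(secrets, orders)
    return before, marked, after
```

In this model the order for the still-live secret `s2` is never touched; only orders pointing at already soft-deleted secrets are marked.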
Hi all,

after giving this some thought, I came to another solution that I think is the most appropriate here, a kind of variation of option 1:

4. Castellan should clean up intermediate resources before returning the secret ID(s) to the caller.

As I see it now, the root of the problem is in castellan's BarbicanKeyManager and the way it hides implementation details from the user. Since it returns only the IDs of the created secrets, the API caller has no notion that something else has to be deleted when the time comes. Since the Barbican API is perfectly capable of deleting orders and containers without deleting the secrets they reference, this is what castellan should do just before it returns the IDs of the generated secrets to the API caller. The only small trouble is that with the default 'legacy' API policies in Barbican, not everybody who can create orders can delete them, but this can be accounted for with try..except.

Please review the patch in this regard: https://review.opendev.org/c/openstack/castellan/+/877423

Best regards,

On Mon, Mar 6, 2023 at 7:32 PM Pavlo Shchelokovskyy <pshchelokovskyy@mirantis.com> wrote:
-- Dr. Pavlo Shchelokovskyy Principal Software Engineer Mirantis Inc www.mirantis.com
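The idea in option 4 (delete the intermediate order right after the secret has been generated, and tolerate a policy refusal) can be sketched roughly as follows. This is not castellan's BarbicanKeyManager code and not the patch under review: `FakeBarbican`, `Forbidden` and `create_key` are hypothetical stand-ins, and the fake API completes orders immediately, which a real deployment would not necessarily do.

```python
import time


class Forbidden(Exception):
    """Stand-in for the 'not allowed by policy' error a real client would raise."""


class FakeBarbican:
    """In-memory stand-in for the Barbican API (hypothetical, for illustration)."""

    def __init__(self, creator_may_delete_orders):
        self.creator_may_delete_orders = creator_may_delete_orders
        self.orders = {}
        self.secrets = set()
        self._counter = 0

    def create_order(self):
        # The fake completes the order at once and generates the secret.
        self._counter += 1
        order_id = f"order-{self._counter}"
        secret_id = f"secret-{self._counter}"
        self.secrets.add(secret_id)
        self.orders[order_id] = {"status": "ACTIVE", "secret_id": secret_id}
        return order_id

    def get_order(self, order_id):
        return self.orders[order_id]

    def delete_order(self, order_id):
        # Deleting an order leaves the secret it generated in place.
        if not self.creator_may_delete_orders:
            raise Forbidden(order_id)
        del self.orders[order_id]


def create_key(api, poll_interval=0.0, max_polls=10):
    """Create a secret via an order, then try to remove the order itself."""
    order_id = api.create_order()
    for _ in range(max_polls):
        order = api.get_order(order_id)
        if order["status"] == "ACTIVE":
            break
        time.sleep(poll_interval)
    else:
        raise RuntimeError("order did not become ACTIVE")
    secret_id = order["secret_id"]
    try:
        api.delete_order(order_id)
    except Forbidden:
        # Legacy policy: the creator may not delete orders; the order stays behind.
        pass
    return secret_id
```

With a policy that lets creators delete orders, the caller ends up with a secret and no leftover order; with the legacy policy, the behavior simply degrades to what happens today.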
On 3/21/23 08:58, Pavlo Shchelokovskyy wrote:
> Hi all,
>
> after having some thoughts, I came to another solution, that I think is the most appropriate here, kind of a variation of option 1:
>
> 4. Castellan should cleanup intermediate resources before returning secret ID(s) to the caller

Hi Pavlo,

We discussed this issue during last week's PTG sessions [1], and we agree that this approach makes sense from a Castellan point of view.

> As I see it now, the root of the problem is in castellan's BarbicanKeyManager and the way it hides implementation details from the user. Since it returns only IDs of created secrets to the user, the api caller has no notion that something else has to be deleted once it is time for this. Since Barbican API is perfectly capable to delete orders and containers without deleting the secrets they reference, this is what castellan should do just before it returns IDs of generated secrets to the API caller. The only small trouble is that with default 'legacy' API policies in Barbican, not everybody who can create orders can delete them.. but this can be accounted for with try..except.

I think it would make sense to update the legacy policies to allow users with the "creator" role to delete orders. This change is similar to a change we made to the Secrets policy to allow deletion by users with the "creator" role as well. [2]

> Please review the patch in this regard https://review.opendev.org/c/openstack/castellan/+/877423

Thanks for the patch, I've added it to my review queue.

Additionally, we discussed some changes we'll make to the API this cycle to hopefully make it easier to manage orders:

* Add a new API microversion so we can:
  * Add a new "generated_by" field to the Secret and Container metadata that contains the order ID for secrets/containers that were created by an order. This would be null for secrets not created by an Order.
  * Cascade delete the Order when the secret or container is deleted.

We'll also be looking at the barbican-manage CLI to make sure that purging deleted secrets is working as expected.

Regards,
- Douglas Mendizábal

[1] https://etherpad.opendev.org/p/march2023-ptg-barbican
[2] https://storyboard.openstack.org/#!/story/2009791
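As a rough illustration of the planned changes described in this reply (a back-reference from the secret to the order that generated it, plus cascade deletion of that order), here is a small in-memory sketch. It is only a model of the idea under stated assumptions: the `Store` class and its methods are hypothetical, the field is spelled `generated_by` here, and nothing below reflects an existing Barbican API or microversion.

```python
class Store:
    """Toy model of secrets carrying a back-reference to their order."""

    def __init__(self):
        self.orders = {}   # order_id -> secret_id
        self.secrets = {}  # secret_id -> {"generated_by": order_id or None}

    def store_secret(self, secret_id):
        # Secret uploaded directly: no order involved, so the field is None.
        self.secrets[secret_id] = {"generated_by": None}

    def create_via_order(self, order_id, secret_id):
        # Secret generated by an order: remember which order produced it.
        self.orders[order_id] = secret_id
        self.secrets[secret_id] = {"generated_by": order_id}

    def delete_secret(self, secret_id):
        # Cascade: removing the secret also removes the order that made it.
        meta = self.secrets.pop(secret_id)
        order_id = meta["generated_by"]
        if order_id is not None:
            self.orders.pop(order_id, None)
```

A caller that only knows the secret ID (as Cinder and Nova do via castellan) would then no longer leave an order behind when it deletes the secret.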
participants (2)
- Douglas Mendizabal
- Pavlo Shchelokovskyy