---- On Tue, 22 Jan 2019 10:14:50 +0900 Adrian Turjak <adriant@catalyst.net.nz> wrote ----
I've expanded on the notes in the etherpad about why Keystone isn't the actor.
At the summit we discussed this option, and all the people familiar with Keystone who were in the room (or in some later discussions) agreed that making Keystone the actor is a BAD idea.
Keystone does not currently do any orchestration or workflow of this nature, making it do that adds a lot of extra logic which it just shouldn't need. After a project delete it would need to call all the APIs, and then confirm they succeeded, and maybe retry. This would have to be done asynchronously since waiting and confirming the deletion would take longer than a single API call to delete a project in Keystone should take. That kind of logic doesn't fit in Keystone. Not to mention there are issues on how Keystone would know which services support such an API, and where exactly it might be (although catalog + consistent API placement or discovery could solve that).
Essentially, going down the route of "make this Keystone's problem" is in my opinion a hard NO, but I'll let the Keystone devs weigh in on that before we make that a very firm hard NO.
As for solutions: ideally we do implement the APIs per service (that's the end goal), but we ALSO make libraries that delete resources using the existing APIs. If the library sees that a service version is one with the purge API, it uses it; otherwise it falls back to a less efficient deletion. This has the major benefit of working for all existing deployments, including ones stuck on older OpenStack versions. This is a universal problem and we need to solve it backwards AND forwards.
By doing both (with a first step focus on the libraries) we can actually give projects more time to build the purge API, and maybe have the API portion of the goal extend into another cycle if needed.
Essentially, we'd make a purge library that uses the SDK to delete resources. If a service has a purge endpoint, then the library (via the SDK) uses that. The specifics of how the library purges, or if the library will be split into multiple libraries (one top level, and then one per service) is to be decided.
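To make the "use the purge endpoint if it exists, otherwise fall back" idea concrete, here is a minimal sketch. Everything in it is hypothetical: `ServiceClient`, `supports_purge`, `purge_project`, `list_resources` and `delete_resource` are stand-ins I made up for illustration, not real SDK calls, and how the library would actually detect purge support (version discovery, catalog, etc.) is still to be decided.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceClient:
    """Hypothetical stand-in for a per-service SDK proxy (not a real SDK class)."""
    name: str
    supports_purge: bool  # assumption: discovered from the service version somehow
    resources: Dict[str, List[str]] = field(default_factory=dict)  # project_id -> resource ids
    calls: List[str] = field(default_factory=list)  # record of calls, for illustration

    def purge_project(self, project_id: str) -> None:
        # Assumed single-call purge endpoint on newer services.
        self.calls.append(f"purge:{project_id}")
        self.resources.pop(project_id, None)

    def list_resources(self, project_id: str) -> List[str]:
        # Return a copy so callers can delete while iterating.
        return list(self.resources.get(project_id, []))

    def delete_resource(self, project_id: str, resource_id: str) -> None:
        self.calls.append(f"delete:{resource_id}")
        self.resources[project_id].remove(resource_id)


def purge_service(client: ServiceClient, project_id: str) -> str:
    """Purge one service, preferring the purge API and falling back to
    per-resource deletion through the existing APIs."""
    if client.supports_purge:
        client.purge_project(project_id)
        return "purge-api"
    for resource_id in client.list_resources(project_id):
        client.delete_resource(project_id, resource_id)
    return "fallback"
```

The caller never needs to know which path was taken, which is what would let the same library work against both old and new deployments.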
A rough look at what a deletion process might look like:
1. Disable project in Keystone (so no new resources can be created or modified), or clear all role assignments (and api-keys) from project.
2. Purge platform orchestration services (Magnum, Sahara).
3. Purge Heat (Heat after Magnum, because Magnum and such use Heat, and deleting Heat stacks without deleting the 'resource' which uses that stack can leave a mess).
4. Purge everything left (order to be decided or potentially dynamically chosen).
5. Delete or disable Keystone project (disable is enough really).
One important thing we need to discuss is rollback. If one or more services are unable to delete their resources, what should the purge library do? Error out and roll back? Report success with the non-deleted resources left behind? Return an error listing the non-deleted resources and hold the project deletion until they are gone? Or it could be a multiple-run deletion that keeps the project in the disabled state until all resources are gone. Because this library is going to provide the functionality to clean up everything, a half-cleaned project deletion can be another issue. IMO the project can stay in the disabled state until the user is able to delete all the resources with the library we provide. -gmann
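As a rough sketch of the deletion order combined with the multiple-run option: the project is disabled first, services are purged stage by stage, and the function returns whatever could not be deleted so the caller can re-run later while the project stays disabled. The stage contents beyond Magnum/Sahara and Heat, the `purgers` and `disable_project` callables, and the choice to stop at the first stage with leftovers are all assumptions for illustration, not a decided design.

```python
from typing import Callable, Dict, List, Sequence

# Ordered stages; only "orchestration services, then Heat, then the rest"
# comes from the thread. The service names in the last stage are illustrative.
STAGES: List[List[str]] = [["magnum", "sahara"], ["heat"], ["nova", "cinder", "neutron"]]


def purge_project(
    project_id: str,
    purgers: Dict[str, Callable[[str], List[str]]],
    disable_project: Callable[[str], None],
    stages: Sequence[Sequence[str]] = STAGES,
) -> Dict[str, List[str]]:
    """Run one purge pass.

    purgers maps a service name to a hypothetical callable that purges the
    project's resources in that service and returns the ids it could NOT delete.
    Returns {service: leftover ids}; an empty dict means the project is clean.
    """
    # Step 1: disable the project so nothing new is created (assumed idempotent).
    disable_project(project_id)

    leftovers: Dict[str, List[str]] = {}
    for stage in stages:
        for service in stage:
            purge = purgers.get(service)
            if purge is None:
                continue  # service not deployed or not selected
            remaining = purge(project_id)
            if remaining:
                leftovers[service] = remaining
        # Assumption: do not touch later stages while an earlier stage still
        # holds resources (e.g. don't purge Heat while Magnum clusters remain).
        if any(service in leftovers for service in stage):
            break
    return leftovers
```

With this shape the project is only deleted (or left disabled for good) once a pass returns an empty dict; until then it stays disabled and the pass can simply be repeated.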
The actor is then first a CLI built into the purge library as an OSClient command, then secondly maybe an API or two in Adjutant which will use this library. Or anyone can use the library and make anything they want an actor.
Ideally if we can even make the library allow selectively choosing which services to purge (conditional on dependency chain), that could be useful for cases where a user wants to delete everything except maybe what's in Swift or Cinder.
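One way selective purging could respect the dependency chain, as a sketch only: describe which services must be purged before which, refuse a selection that would skip a required earlier purge, and order the rest topologically. The `PURGE_AFTER` table is an assumption for illustration (only "Heat after Magnum and friends" comes from this thread), and `plan_purge` is a hypothetical helper. It uses `graphlib` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter
from typing import Dict, Iterable, List, Set

# service -> services that must be purged BEFORE it (i.e. its consumers).
# Illustrative only; the real ordering is still to be decided.
PURGE_AFTER: Dict[str, Set[str]] = {
    "magnum": set(),
    "sahara": set(),
    "heat": {"magnum", "sahara"},
    "nova": {"heat"},
    "neutron": {"heat", "nova"},
    "cinder": {"heat", "nova"},
    "swift": set(),
}


def plan_purge(exclude: Iterable[str] = (), graph: Dict[str, Set[str]] = PURGE_AFTER) -> List[str]:
    """Return a purge order for every service not in `exclude`.

    Raises ValueError if a selected service depends on an excluded one being
    purged first (e.g. purging Heat while keeping Magnum).
    """
    excluded = set(exclude)
    selected = set(graph) - excluded
    for service in sorted(selected):
        blocked = graph[service] & excluded
        if blocked:
            raise ValueError(
                f"cannot purge {service} while keeping {sorted(blocked)}"
            )
    # Keep only the selected services; their prerequisites are all selected too.
    subgraph = {service: graph[service] for service in selected}
    return list(TopologicalSorter(subgraph).static_order())
```

For example, `plan_purge(exclude={"swift", "cinder"})` would give an order with Magnum before Heat before Nova, leaving Swift and Cinder alone, while `plan_purge(exclude={"magnum"})` is rejected because Heat cannot safely be purged first.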
This is in many ways a HUGE goal, but one that we really need to accomplish. We've lived with this problem too long and the longer we leave it unsolved, the harder it becomes.
On 22/01/19 9:30 AM, Lance Bragstad wrote:
On Mon, Jan 21, 2019 at 2:18 PM Ed Leafe <ed@leafe.com> wrote:
On Jan 21, 2019, at 1:55 PM, Lance Bragstad <lbragstad@gmail.com> wrote:
>
> Are you referring to the system scope approach detailed on line 38, here [0]?
Yes.
> I might be misunderstanding something, but I didn't think keystone was going to iterate all available services and call clean-up APIs. I think it was just that services would be able to expose an endpoint that cleans up resources without a project scoped token (e.g., it would be system scoped [1]).
>
> [0] https://etherpad.openstack.org/p/community-goal-project-deletion
> [1] https://docs.openstack.org/keystone/latest/admin/tokens-overview.html#system...
It is more likely that I’m misunderstanding. Reading that etherpad, it appeared that it was indeed the goal to have project deletion in Keystone cascade to all the services, but I guess I missed line 19.
So if it isn’t Keystone calling this API on all the services, what would be the appropriate actor?
The actor could still be something like os-purge or adjutant [0]. Depending on how the implementation shakes out in each service, the implementation in the actor could be an iteration over all services, calling the same API on each one. I guess the benefit is that the actor doesn't need to manage the deletion order based on the dependencies of the resources (internal or external to a service). Adrian, and others, have given this a bunch more thought than I have. So I'm curious to hear if what I'm saying is in line with how they've envisioned things. I'm recalling most of this from Berlin.
[0] https://adjutant.readthedocs.io/en/latest/
-- Ed Leafe