Hi Chris,

Following your email, I think it makes sense to mention the activities we have been conducting, initially within the FEMDC SIG [1] and more recently under the Edge Computing WG managed by the OSF.

To make a long story short, we [2] tried both approaches you mentioned.

Regarding the shared database first: we investigated and compared the performance of Galera and CockroachDB [3]. You can refer to [4][5] for further details.

Regarding the second approach you talked about: it looks rather similar to the OpenStackoid (OID) prototype we introduced during the Berlin Hackathon/Summit [6], and we would be more than happy to get your feedback on it. Briefly, our mechanism goes beyond delivering collaborations/federations between multiple instances of Keystone, in the sense that it can be applied to any OpenStack service. Given several running OpenStack clouds, the idea is to use a DSL to specify on which instance of a service (i.e., on which OpenStack) the user wants each part of a request to be executed. For instance, to provision a VM on a Nova in Paris using an image from a Glance in New York, you may contact the API either in New York or in Paris and perform the following request:

    openstack server create --image debian --scope {compute: Paris, image: New York}

A proof of concept is available on GitLab [7].

Obviously, our approach needs to be challenged, and additional work is required to deliver a production-ready system. For instance, we should define clear semantics for the DSL. This is critical as we are extending the DSL with operators such as AND and OR. Using operators will enable admins/DevOps to perform requests such as "list all VMs running in Paris, New York, and Boston":

    openstack server list --scope {compute: Paris & NewYork & Boston}

An academic paper presenting the overall concept has been submitted to the HotEdge workshop. We do not know yet whether the paper will be accepted (due to COVID-19, everything is unfortunately delayed), but we hope to share it ASAP.

Feedback/questions/comments welcome :-)

Have a nice day.

[1] https://wiki.openstack.org/wiki/Fog_Edge_Massively_Distributed_Clouds
[2] https://beyondtheclouds.github.io/
[3] https://www.cockroachlabs.com/
[4] https://beyondtheclouds.github.io/blog/openstack/cockroachdb/2018/06/04/eval... (This article was never entirely finished because we lacked time, but it contains all the results we obtained. Note also that the CockroachDB team has done some amazing work on performance since then, so the results would probably be better now.)
[5] https://www.openstack.org/videos/summits/vancouver-2018/keystone-in-the-cont...
[6] https://www.openstack.org/summit/denver-2019/summit-schedule/events/23352/im...
[7] https://gitlab.inria.fr/discovery/openstackoid

Marie Delavergne

----- Original Message -----
De: "Krzysztof Klimonda" <kklimonda@syntaxhighlighted.com> À: "openstack-discuss" <openstack-discuss@lists.openstack.org> Envoyé: Mardi 24 Mars 2020 14:45:42 Objet: [keystone][horizon] Two approaches to "multi-region" OpenStack deployment
Hi,
I've been spending some time recently thinking about the best approach to some sort of multi-region deployment of OpenStack clouds.
The two approaches that I'm currently evaluating are a shared Keystone database (with a Galera cluster spanning three locations) and a shared-nothing approach, where an external component is responsible for managing users, projects, etc.
A shared Keystone database seems fairly straightforward from an OpenStack point of view (I'm ignoring Galera replication-over-WAN woes for the purpose of this discussion) until I grow past three regions. Additional regions must either reuse the "global" Keystone, adding latency everywhere, or we need a way to replicate data from the "master" Galera cluster to "slave" clusters and route all database write statements back to the master Galera cluster, while reading from the local asynchronous replica.
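For what it's worth, oslo.db already exposes knobs for exactly this split in keystone.conf: connection for writes and slave_connection for reads that can tolerate staleness. A minimal sketch, with made-up hostnames (it is worth verifying how much of Keystone's read path actually honours slave_connection):

    [database]
    # every write (and read-your-write transaction) goes to the master cluster
    connection = mysql+pymysql://keystone:secret@galera-master.region1:3306/keystone
    # reads that can tolerate lag may be served by the local async replica
    slave_connection = mysql+pymysql://keystone:secret@replica.local:3306/keystone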
This has me somewhat worried, as doing that turns the deployment into an eventually-consistent one of sorts. Services deployed in regions with asynchronous replication can no longer depend on the fact that, once a transaction has committed, consecutive reads will return up-to-date state. I can imagine scenarios where, as an example, a trust is set up for Heat, but that fact has not yet been replicated to the local database by the time Heat tries to issue a token based on that trust, and the process fails.
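To make the failure mode concrete, here is a hedged sketch of that race with the plain CLI (untested; the user and role names are made up):

    # the INSERT for the trust lands on the master cluster
    openstack trust create --project demo --role member heat_user trustee_user
    # the trustee then requests a trust-scoped token
    export OS_TRUST_ID=<id printed by the previous command>
    # this read may be served by the local asynchronous replica; while the
    # replication lag exceeds the client round-trip, the replica has no row
    # for the trust yet and token issuance fails, even though the trust
    # already exists on the master
    openstack token issue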
The other approach would be to keep the Keystone databases completely separate and have something external to OpenStack manage all of those resources.
While not sharing the Keystone database between regions sidesteps the scalability issue, and the entire setup seems more resilient to failures, it is not without its own drawbacks:
* In this setup Horizon can no longer switch between regions without additional work (see notes below)
* There is no longer a single API entry point to the cloud
* Some Keystone API operations would have to be withheld from users via custom policy - for example, managing user assignment to projects (for users who have the domain admin role); a policy sketch follows this list
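For the last point, a hedged sketch of what such a custom policy could look like in Keystone's policy.yaml (these rule names exist upstream, but defaults vary by release, so verify against the deployment). The idea is that only the cloud admin - in practice, the external manager - may touch role assignments:

    # domain admins may no longer grant/revoke roles locally;
    # assignments must flow through the external manager instead
    "identity:create_grant": "rule:admin_required"
    "identity:revoke_grant": "rule:admin_required"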
Additional thoughts
Could Horizon be modified to switch endpoints based on the region selected in the UI? Is the token reissued when the region is changed in Horizon, or is a single token used? I'm assuming the former, given my understanding that a new token is issued when projects are switched - but perhaps the initial token is always used to issue project-scoped tokens for Horizon?
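As a data point, Horizon already exposes a region switcher once several Keystone endpoints are listed in local_settings.py; whether and when tokens are reissued on a switch is exactly the part worth verifying. A minimal sketch (the endpoints are made up):

    AVAILABLE_REGIONS = [
        ('https://keystone.paris.example:5000/v3', 'Paris'),
        ('https://keystone.newyork.example:5000/v3', 'NewYork'),
    ]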
In the second scenario, with separate Keystone databases, a backend for Keystone could be created that proxies some operations (like the aforementioned user assignment) back to the external manager so that they can be propagated to the other clouds - a rough sketch follows. Does that even make sense?
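One rough way to prototype such a backend: Keystone's drivers are pluggable, so a subclass of the in-tree SQL assignment driver could forward grant changes to the external manager. A hedged sketch (the module path and create_grant signature are taken from Keystone's in-tree driver but should be checked against the deployed release; the manager endpoint is entirely hypothetical):

    # a hedged sketch, not production code
    import requests
    from keystone.assignment.backends import sql

    EXTERNAL_MANAGER_URL = 'https://manager.example/v1/grants'  # hypothetical

    class ProxyingAssignment(sql.Assignment):
        def create_grant(self, role_id, user_id=None, group_id=None,
                         domain_id=None, project_id=None,
                         inherited_to_projects=False):
            # persist locally first, exactly like the stock SQL driver
            super().create_grant(
                role_id, user_id=user_id, group_id=group_id,
                domain_id=domain_id, project_id=project_id,
                inherited_to_projects=inherited_to_projects)
            # then propagate; a real system would need retries and
            # reconciliation, otherwise the clouds drift apart on failure
            requests.post(EXTERNAL_MANAGER_URL, json={
                'role_id': role_id, 'user_id': user_id, 'group_id': group_id,
                'domain_id': domain_id, 'project_id': project_id,
            }, timeout=5)

Wiring it in would then presumably be a matter of pointing the [assignment]/driver option in keystone.conf at that class.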
In the end, I'm reaching out in the hope that someone can chime in based on their experience - perhaps I'm missing a better approach, or making wrong assumptions in my email, especially around asynchronous replication of the Keystone database and its effect on services in regions that may not have an up-to-date view of the database. Or perhaps trying to synchronize Keystone state via an external tool is not really worth the additional effort it would require.
-Chris