[oslo][tooz][openstack-ansible] Discussion about coordination (tooz), too many backend options, their state and deployment implications
Hello openstack-discuss,

apologies for this being quite a long message - I tried my best to collect my thoughts on the matter.

1) The role of deployment tooling in fulfilling the requirement for a coordination backend

Honestly, what triggered me to write this are the openstack-ansible plans to add coordination via the Zookeeper backend (https://lists.openstack.org/pipermail/openstack-discuss/2022-October/031013....).

On 27/10/2022 13:10, Dmitriy Rabotyagov wrote:
* Add Zookeeper cluster deployment as coordination service. Coordination is required if you want to have an active/active cinder-volume setup and it is also actively used by other projects, like Octavia or Designate. Zookeeper will be deployed in a separate set of containers for the LXC path
First of all, I believe it is essential for any OpenStack deployment tooling to handle the deployment of a coordination backend, as many OS projects simply rely, in their design and code, on one being in place. But I am convinced that, with this many options, some stronger guidance should be given to people designing and then deploying OpenStack for their platform. This guidance can certainly come in the form of a comparison table - but when it comes to deployment tooling like openstack-ansible, the provided "default" component or option might just be worth more than written text explaining all of the possible approaches.

This holds especially true to me because you can get quite far with no coordination configured at all, which then results in frustration and invalid bugs being raised. And it's not just openstack-ansible thinking about coordination deployment / configuration. To point to just a few:

* Kolla-Ansible: https://lists.openstack.org/pipermail/openstack-discuss/2020-November/018838...
* Charms: https://bugs.launchpad.net/charm-designate/+bug/1759597
* Puppet: https://review.opendev.org/c/openstack/puppet-oslo/+/791628/
* ...

2) Choosing the "right" backend driver

I've recently been looking into the question of which tooz driver would best cover all the coordination use cases the various OS projects require. Yes, the dependencies on and use of coordination within the OS projects (cinder, designate, gnocchi, ...) are very different. I don't want to sound unfair, but most don't communicate which of the tooz services they actually require. In any case, I would just like something that covers all possible requirements - "set it (up) and forget it" - no matter which OS projects run on the platform.
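To make this a bit more concrete, here is a minimal sketch of the distributed locking service - the kind of lock an active/active cinder-volume setup relies on. The member id, lock name and the Zookeeper endpoint are just placeholders, not taken from any particular project:

    from tooz import coordination

    # The backend is selected purely via this URL; a Zookeeper endpoint
    # is assumed here as a placeholder.
    coordinator = coordination.get_coordinator(
        'zookeeper://192.0.2.10:2181', b'cinder-volume-host-1')
    coordinator.start(start_heart=True)

    # Serialize work on a shared resource across all active workers.
    lock = coordinator.get_lock(b'volume-0001')
    if lock.acquire(blocking=True):
        try:
            pass  # ... perform the critical operation on the volume ...
        finally:
            lock.release()

    coordinator.stop()

Whether that lock can actually be relied upon under failures and partitions depends entirely on the driver behind the URL - which is where the following qualities come in.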
Apart from basic compatibility, there are qualities I would expect (in no particular order) from a coordination backend:

* no "best-effort" coordination, but something that can actually be relied upon (CP, if talking CAP)
* HA - this needs to work just as reliably as my database, as otherwise the cloud cannot function
* efficient at getting the job done (e.g. support for events / watches to reduce latency)
* lightweight (deployment), no additional components, readily packaged
* very little maintenance effort
* easy monitoring

I started by reading up on the tooz drivers (https://docs.openstack.org/tooz/latest/user/drivers.html), of which there are more than enough to require some research. Here are my rough thoughts:

a) I ruled out the IPC, file and RDBMS (mysql, postgresql) backend options, as they all come with strong side notes (issues when doing replication, or no HA at all). Additionally, they usually are neither partition tolerant nor support watches.

b) Redis seems quite capable, but there are many side notes about HA, and it also requires setting up and maintaining Sentinel.

c) Memcached supports all three services (locks, groups, leader election) tooz provides and is usually already part of an OpenStack infrastructure, so it looked appealing. But its non-replicating architecture and lack of any strong consistency guarantees make it less of a good "standard". I was even wondering how tooz would try its best to work with multiple memcached nodes (https://bugs.launchpad.net/python-tooz/+bug/1970659).

d) That leaves Zookeeper, which ticks all the (feature) boxes (https://docs.openstack.org/tooz/latest/user/compatibility.html) and is quite a proven tool for coordination outside the OpenStack ecosystem as well. On the downside, it's not really that well known and common (anymore) outside the "data processing" context (see https://cwiki.apache.org/confluence/display/ZOOKEEPER/PoweredBy). Being a Java application, it requires a JVM and its dependencies, and it is quite memory-heavy for storing just a few megabytes of config data. With more and more people putting their OS control plane into something like Kubernetes, it also seems less suitable to be "moved around" a lot. Another issue might be the lack of a recent and non-EoL version packaged in Ubuntu - see https://bugs.launchpad.net/ubuntu/+source/zookeeper/+bug/1854331. Maybe (!) this could be an indication of how commonly it is (still) used. Support for TLS was only added in 3.5.5 (https://zookeeper.apache.org/doc/r3.5.5/zookeeperAdmin.html#Quorum+TLS).

e) Consul - while also well known and loved, it has, like Zookeeper, quite a big footprint and is way more than just a CP-focused database. It's more of an application with many use cases.

f) A last "strong" candidate is etcd. It did not surprise me to see it on the list of possible drivers, and it certainly is a tool known to many from running e.g. Kubernetes. It's actually already part of the openstack-ansible deployment code as a role (https://github.com/openstack/openstack-ansible/commit/2f240dd485b123763442aa...), as it is required when using Calico as SDN. While etcd is also something one must know how to monitor and operate, allow me to say it might just be more common to find this operational knowledge. Also, etcd has a smaller footprint than Zookeeper, and it being "just a Golang binary" means (almost) no additional dependencies. But I noticed that it does not even support "grouping", according to the feature matrix. Apparently this is just a documentation delay though, see https://bugs.launchpad.net/python-tooz/+bug/1995125. What is left to implement would be leader election, but there seems to be no technical reason why this cannot be done.

This is by no means a comparison with a clear winner. I just want to stress how confusing it is to have lots of options with no real guidance. The requirement to choose and deploy coordination might not be a focus when designing an OS cloud.
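To illustrate the other two services the compatibility matrix compares, group membership and leader election, here is an equally minimal sketch - again with placeholder names and endpoint, and assuming a backend that implements both, which per the matrix Zookeeper does:

    import time

    from tooz import coordination

    # Placeholder member and group names; a backend supporting groups and
    # leader election is assumed.
    coordinator = coordination.get_coordinator(
        'zookeeper://192.0.2.10:2181', b'worker-1')
    coordinator.start(start_heart=True)

    group = b'my-service-cluster'
    try:
        coordinator.create_group(group).get()
    except coordination.GroupAlreadyExist:
        pass  # another member created the group first
    coordinator.join_group(group).get()

    def on_elected_leader(event):
        # Called once this member wins the election for the group.
        print('%s is now leader of %s' % (event.member_id, event.group_id))

    coordinator.watch_elected_as_leader(group, on_elected_leader)

    while True:
        coordinator.run_watchers()  # triggers elections and watch callbacks
        time.sleep(1)

With a driver that lacks one of these capabilities, this is exactly the part that will not work, which is why the gaps in the feature matrix matter in practice.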
3) Stronger guidance / "common default", set up via OS deployment tooling and also used for DevStack and tested via CI

To summarize, there are just too many options and implications in the compatibility list to quickly choose the "right" one for one's own deployment. Large-scale deployments might not mind coordination having a bigger footprint and requiring more attention in general. But for smaller and even mid-size deployments, it is just convenient to offload the configuration of coordination and the selection of the backend driver to the deployment tooling. Making it way too easy for such installations to not use coordination and run into issues, or having every other installation use a different backend, creates a very fragmented landscape. Add different operating system distributions and versions, different deployment tooling, different sets and versions of OS projects used, and there will be a great many combinations. This will likely just cause OS projects to receive more, and non-reproducible, bug reports. Also, not having a (somewhat common) coordination backend used within CI and DevStack does not expose the relevant code paths to enough testing.

I'd like to make the analogy to having "just" MySQL as the default database engine, while still allowing other engines to be used (https://governance.openstack.org/tc/resolutions/20170613-postgresql-status.h...). Or labeling certain options as "experimental", as Neutron just did with "linuxbridge" (https://docs.openstack.org/neutron/latest//admin/config-experimental-framewo...), or as cinder does by naming drivers unsupported (https://docs.openstack.org/cinder/ussuri/drivers-all-about.html#unsupported-...).

My point is that just having all those backends and no active guidance might make tooz a very open and flexible component, but I myself would wish for less confusion around this topic and a little less to think about myself.

Maybe the "selection" of Zookeeper by openstack-ansible is just that?

I would love to hear your thoughts on coordination, and why and how you ended up using what you use. And certainly your opinion on the matter of a more strongly communicated "default".

Thanks for your time and thoughts!


Christian
Hello,

Interesting topic. We use Redis because, frankly, we see that as the most logical choice due to the complexity of the others. You might have seen my thread about investigating replacing RabbitMQ with NATS; our plan is to then also investigate getting tooz and oslo.cache to use the JetStream Key-Value feature.

Best regards
Tobias
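For readers wondering what the Redis choice looks like from the tooz side, a rough sketch follows. The addresses are placeholders, and the Sentinel-related query options reflect my reading of the tooz Redis driver documentation, so treat them as an assumption to verify against the version you deploy:

    from tooz import coordination

    # Placeholder Sentinel addresses (26379 is the conventional Sentinel port).
    # sentinel / sentinel_fallback are assumptions based on the tooz Redis
    # driver docs - verify before relying on them.
    url = ('redis://192.0.2.20:26379'
           '?sentinel=mymaster'
           '&sentinel_fallback=192.0.2.21:26379'
           '&sentinel_fallback=192.0.2.22:26379')

    coordinator = coordination.get_coordinator(url, b'api-node-1')
    coordinator.start(start_heart=True)
    # ... locks, groups, etc. as with any other driver ...
    coordinator.stop()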
On 31/10/2022 15:07, Tobias Urdin wrote:
Interesting topic, we use Redis because frankly we see that as the most logical choice due to the complexity of others.
Interesting how one's mileage varies :-)
You might have seen my thread about investigating replacing RabbitMQ with NATS; our plan is to then also investigate getting Tooz and oslo.cache using the Jetstream Key-Value feature.
That sounds really interesting, I shall follow that discussion then.

If one tool, e.g. NATS in your case, could cover more than one communication use case, e.g. (async) messaging and distributed locking, this would reduce the number of different components required to assemble a cloud and thus reduce the complexity - even if more than one instance of that software were required. As I was also arguing, adding more and more implementations and "ways" to do things helps neither the operators nor the developers.

To me, software developers benefit from clear abstractions for cross-cutting concerns such as messaging or coordination. While e.g. tooz already aims to be such an abstraction, when deploying OpenStack or operating a cloud things can look vastly different: no coordination at all, or different drivers with different features and inherently different guarantees and behavior in case of problems. Discussing broadly, and then agreeing not only on a common library and its interface but also on an implementation, is to me not inflexible, but makes sense to keep the complexity manageable. It happened with MySQL/MariaDB as the db engine and actually also with AMQP as the messaging protocol (including its paradigms). Revisiting such decisions and conventions over time is simply part of making progress.

Regards


Christian
On Mon, Oct 31, 2022, at 4:29 AM, Christian Rohmann wrote:
snip
d) That leaves Zookeeper, which ticks all the (feature) boxes (https://docs.openstack.org/tooz/latest/user/compatibility.html) and is quite a proven tool for coordination outside the OpenStack ecosystem as well. On the downside, it's not really that well known and common (anymore) outside the "data processing" context (see https://cwiki.apache.org/confluence/display/ZOOKEEPER/PoweredBy). Being a Java application, it requires a JVM and its dependencies, and it is quite memory-heavy for storing just a few megabytes of config data. With more and more people putting their OS control plane into something like Kubernetes, it also seems less suitable to be "moved around" a lot. Another issue might be the lack of a recent and non-EoL version packaged in Ubuntu - see https://bugs.launchpad.net/ubuntu/+source/zookeeper/+bug/1854331. Maybe (!) this could be an indication of how commonly it is (still) used. Support for TLS was only added in 3.5.5 (https://zookeeper.apache.org/doc/r3.5.5/zookeeperAdmin.html#Quorum+TLS).
Zuul relies on Zookeeper for its coordination and shared state (without tooz). This is nice because it means we can look at the OpenDev Zuul ZK cluster stats for more info.

We currently run a three node cluster. Each node is a 4vcpu 4GB memory VM. The JVM itself seems to consume just under a gig of memory per node. Total system memory stats can be seen here [0]. According to `docker image list` the zookeeper container images we are running are 265MB large. If you scroll to the bottom of this grafana dashboard [1] you'll see operating stats for the cluster.

All that to show that zookeeper isn't free, but it also isn't terribly expensive to run either. Particularly when it tends to fill an important role of preventing software from trampling over itself.

As far as installing it goes, we've been happily using the official docker images [2]. They have worked well for us and have been kept up to date (including TLS support). If you don't want to use those images the tarballs upstream publishes [3] include init scripts that can be used to manage zookeeper as a proper service. You just download, verify, extract, and execute the script (assuming you have java installed) and the service runs.

I'm not going to try and convince anyone that they should use Zookeeper or not. I just want to put concrete details on some of these concerns.

[0] http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=70034&rra_id=all
[1] https://grafana.opendev.org/d/21a6e53ea4/zuul-status?orgId=1&from=now-7d&to=now
[2] https://hub.docker.com/_/zookeeper
[3] https://zookeeper.apache.org/releases.html
participants (3)
- Christian Rohmann
- Clark Boylan
- Tobias Urdin