[openstack-dev] [tc] Active or passive role with our database layer

Jay Pipes jaypipes at gmail.com
Tue May 23 18:43:37 UTC 2017


On 05/23/2017 07:23 AM, Chris Dent wrote:
> That "higher dev cost" is one of my objections to the 'active'
> approach but it is another implication that worries me more. If we
> limit deployer architecture choices at the persistence layer then it
> seems very likely that we will be tempted to build more and more
> power and control into the persistence layer rather than in the
> so-called "business" layer. In my experience this is a recipe for
> ossification. The persistence layer needs to be dumb and
> replaceable.

Err, in my experience, having a *completely* dumb persistence layer -- 
i.e. one that tries to smooth over the differences between, say, 
relational and non-relational stores -- is a recipe for disaster. The 
developer just ends up writing join constructs in the business layer 
instead of using a relational data store the way it is intended to be 
used. The same goes for aggregate operations. [1]

Now, if what you're referring to is "don't use vendor-specific 
extensions in your persistence layer", then yes, I agree with you.

Best,
-jay

[1] Witness the join constructs in Golang in Kubernetes as they work 
around etcd not being a relational data store:

https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/deployment/deployment_controller.go#L528-L556

Instead of a single SQL statement:

SELECT p.* FROM pods AS p
JOIN deployments AS d
ON p.deployment_id = d.id
WHERE d.name = $name;

the deployments controller code has to read every Pod message from etcd 
and loop through each Pod message, returning a list of Pods that match 
the deployment searched for.
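
To make the contrast concrete, here is a rough sketch in Python (not the 
actual Kubernetes code, which is Go and matches pods by label selector). 
The record layout, the sample data and both function names are made up 
for illustration; the field names simply mirror the SQL statement above. 
It uses the stdlib sqlite3 module as a stand-in relational store:

```python
import sqlite3

# Hypothetical sample data; field names mirror the SQL above,
# not the real Kubernetes object schema.
deployments = [{"id": 1, "name": "web"}, {"id": 2, "name": "db"}]
pods = [
    {"id": 10, "deployment_id": 1},
    {"id": 11, "deployment_id": 2},
    {"id": 12, "deployment_id": 1},
]


def pods_for_deployment_client_side(name):
    """Join in application code: read every pod and filter in a loop."""
    wanted = {d["id"] for d in deployments if d["name"] == name}
    return [p["id"] for p in pods if p["deployment_id"] in wanted]


def pods_for_deployment_sql(conn, name):
    """Let the relational store do the join."""
    rows = conn.execute(
        "SELECT p.id FROM pods AS p "
        "JOIN deployments AS d ON p.deployment_id = d.id "
        "WHERE d.name = ? ORDER BY p.id",
        (name,),
    )
    return [row[0] for row in rows]


# Load the same sample data into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deployments (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE pods (id INTEGER PRIMARY KEY, deployment_id INTEGER)")
conn.executemany("INSERT INTO deployments VALUES (:id, :name)", deployments)
conn.executemany("INSERT INTO pods VALUES (:id, :deployment_id)", pods)
```

Both functions return the same pods, but the first one has to pull every 
record to the client before it can discard the ones it does not want.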

Similarly, the Kubernetes API does not support any aggregate (SUM, GROUP 
BY, etc.) functionality. Instead, clients are required to perform these 
kinds of calculations/operations in memory. This is because etcd, being 
an (awesome) key/value store, is not designed for aggregate operations 
(just as Cassandra or CockroachDB do not allow most aggregate operations).
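
As an illustration of what "in memory" means here, this is roughly what a 
client ends up writing in place of something like "SELECT node, 
SUM(memory_mb) FROM pods GROUP BY node". It is only a sketch: the record 
fields (node, memory_mb) and the function name are invented for the 
example and are not taken from the Kubernetes API.

```python
from collections import defaultdict

# Hypothetical records, as a client might hold them after listing
# every object from the API.
pod_records = [
    {"node": "node-a", "memory_mb": 512},
    {"node": "node-b", "memory_mb": 256},
    {"node": "node-a", "memory_mb": 1024},
]


def memory_by_node(records):
    """Client-side equivalent of SUM(memory_mb) ... GROUP BY node."""
    totals = defaultdict(int)
    for record in records:
        totals[record["node"]] += record["memory_mb"]
    return dict(totals)
```

Every client that needs the totals has to fetch the full list and repeat 
this loop, since there is nowhere on the server side to push it down to.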

My point here is not to denigrate Kubernetes. Far from it. They (to 
date) have a relatively shallow relational schema and doing join and 
index maintenance [2] operations in client-side code has so far been a 
cost that the project has been OK carrying. The point I'm trying to make 
is that the choice of data store semantics (relational or not, columnar 
or not, eventually-consistent or not, etc) *does make a difference* to 
the architecture of a project, its deployment and the amount of code 
that the project needs to keep to properly handle its data schema. 
There's no way -- in my experience -- to make a "persistence layer" that 
papers over these differences and ends up being useful.

[2] In Kubernetes, all services are required to keep all relevant data 
in memory:

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/principles.md

This means that code that maintains a bunch of in-memory indexes of 
various data objects ends up being placed into every component. Here's 
an example of this in the kubelet (the equivalent-ish of the 
nova-compute daemon) pod manager, keeping an index of pods and mirrored 
pods in memory:

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/pod/pod_manager.go#L104-L114

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/pod/pod_manager.go#L159-L181
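
For readers who don't want to click through, the general shape of that 
kind of index maintenance looks something like the sketch below. This is 
not a port of the kubelet code: the PodIndex class, its method names and 
the dict-based pod records are hypothetical, and it only shows the 
bookkeeping needed to keep two lookup tables consistent by hand.

```python
class PodIndex:
    """Hand-maintained in-memory indexes over pod records (illustrative)."""

    def __init__(self):
        self._by_uid = {}
        self._by_full_name = {}

    def add(self, pod):
        # Drop any stale entry first so both indexes stay in sync
        # when a pod with the same uid is re-added under a new name.
        self.remove(pod["uid"])
        self._by_uid[pod["uid"]] = pod
        self._by_full_name[(pod["namespace"], pod["name"])] = pod

    def remove(self, uid):
        pod = self._by_uid.pop(uid, None)
        if pod is not None:
            self._by_full_name.pop((pod["namespace"], pod["name"]), None)

    def get_by_uid(self, uid):
        return self._by_uid.get(uid)

    def get_by_full_name(self, namespace, name):
        return self._by_full_name.get((namespace, name))
```

With a relational store, the two dictionaries would simply be two indexes 
on a table, and the add/remove bookkeeping would be the database's job.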


