[openstack-dev] [tc] Active or passive role with our database layer

Chris Dent cdent+os at anticdent.org
Tue May 23 11:23:29 UTC 2017


On Sun, 21 May 2017, Monty Taylor wrote:

> As the discussion around PostgreSQL has progressed, it has come clear to me 
> that there is a decently deep philosophical question on which we do not 
> currently share either definition or agreement. I believe that the lack of 
> clarity on this point is one of the things that makes the PostgreSQL 
> conversation difficult.

Good analysis. I think this does hit on at least some of the core
differences, maybe even most. And as with so many other things we do
in OpenStack, because we have landed somewhere in the middle between
the two positions we find ourselves in a pickle (see, for example,
the different needs for and attitudes to orchestration underlying
this thread [1]).

You're right to say we need to pick one and move in that direction,
but our usual struggles with reaching agreement across the entire
community, especially on an opinionated position, will need to be
overcome. Writing about it to make it visible is a good start.

> In the "external" approach, we document the expectations and then write the 
> code assuming that the database is set up appropriately. We may provide some 
> helper tools, such as 'nova-manage db sync' and documentation on the sequence 
> of steps the operator should take.
>
> In the "active" approach, we still document expectations, but we also 
> validate them. If they are not what we expect but can be changed at runtime, 
> we change them overriding conflicting environmental config, and if we can't, 
> we hard-stop indicating an unsuitable environment. Rather than providing 
> helper tools, we perform the steps needed ourselves, in the order they need 
> to be performed, ensuring that they are done in the manner in which they need 
> to be done.

I think there's a middle ground here, "externalize but validate",
which is:

* document expectations
* validate them
* do _not_ change at runtime, but tell people what's wrong
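
A minimal sketch of what "externalize but validate" might look like in
Python. Everything here is hypothetical and for illustration only: the
setting names, the expected values, and the validate() helper are
assumptions, not part of any OpenStack project. The point is that the
check reports mismatches and never modifies the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    """One mismatch between a documented expectation and reality."""
    setting: str
    expected: str
    actual: str

    def __str__(self) -> str:
        return (f"{self.setting}: expected {self.expected!r}, "
                f"found {self.actual!r}")


# Hypothetical documented expectations; a real project would define its own.
EXPECTED = {
    "character_set": "utf8",
    "collation": "utf8_general_ci",
}


def validate(actual: dict[str, str],
             expected: dict[str, str]) -> list[Problem]:
    """Compare settings against expectations. Report only, never modify."""
    problems = []
    for name, want in expected.items():
        have = actual.get(name, "<unset>")
        if have != want:
            problems.append(Problem(name, want, have))
    return problems


if __name__ == "__main__":
    # Settings as an operator's database might report them (made up here).
    found = {"character_set": "latin1", "collation": "utf8_general_ci"}
    for problem in validate(found, EXPECTED):
        print(f"unsuitable database setting, {problem}")
```

What to do with a non-empty result (log a warning, refuse to start) is
left to the caller; the operator still makes the actual change.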

> Some operations have one and only one "right" way to be done. For those 
> operations if we take an 'active' approach, we can implement them once and 
> not make all of our deployers and distributors each implement and run them. 
> However, there is a cost to that. Automatic and prescriptive behavior has a 
> higher dev cost that is proportional to the number of supported 
> architectures. This then implies a need to limit deployer architecture 
> choices.

That "higher dev cost" is one of my objections to the 'active'
approach, but it is another implication that worries me more. If we
limit deployer architecture choices at the persistence layer then it
seems very likely that we will be tempted to build more and more
power and control into the persistence layer rather than in the
so-called "business" layer. In my experience this is a recipe for
ossification. The persistence layer needs to be dumb and
replaceable.
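
As a rough illustration of "dumb and replaceable", here is a sketch in
Python. The Store interface, both backends, and the rename_server rule
are invented for this example and do not correspond to any OpenStack
code. The persistence contract is just put and get, the rule about what
counts as a valid name lives in the calling code, and either backend
can be swapped in without touching that rule.

```python
import sqlite3
from typing import Optional, Protocol


class Store(Protocol):
    """The whole persistence contract: put and get, nothing clever."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class MemoryStore:
    """Backend that keeps everything in a dict."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class SqliteStore:
    """Backend that keeps everything in a single SQLite table."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")

    def put(self, key: str, value: str) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)",
                (key, value))

    def get(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return row[0] if row else None


def rename_server(store: Store, server_id: str, new_name: str) -> str:
    """Hypothetical business rule: names are trimmed, lowercased, non-empty.

    The rule is enforced here, not by the storage backend.
    """
    name = new_name.strip().lower()
    if not name:
        raise ValueError("server name must not be empty")
    store.put(server_id, name)
    return name
```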

> On the other hand, taking an 'external' approach allows us to federate the 
> work of supporting the different architectures to the deployers. This means 
> more work on the deployer's part, but also potentially a greater amount of 
> freedom on their part to deploy supporting services the way they want. It 
> means that some of the things that have been requested of us - such as easier 
> operation and an increase in the number of things that can be upgraded with 
> no-downtime - might become prohibitively costly for us to implement.

That's not necessarily the case. In an external approach, where the
persistence layer is opaque to the application, third parties
(downstream consumers, the market, the invisible hand, etc.) have the
option to do all kinds of wacky stuff. Probably avec containers™.

In that model, the core functionality is simple and adequate but not
deluxe. Deluxe is an after-market add-on.

> BUT - without a decision as to what our long-term philosophical intent in 
> this space is that is clear and understandable to everyone, we cannot have 
> successful discussions about the impact of implementation choices, since we 
> will not have a shared understanding of the problem space or the solutions 
> we're talking about.

Yes.

> For my part - I hear complaints that OpenStack is 'difficult' to operate and 
> requests for us to make it easier. This is why I have been advocating some 
> actions that are clearly rooted in an 'active' worldview.

If OpenStack were more of a monolith instead of a system with three to
many different databases, along with some optional number of other
ways to do other kinds of (short term) persistence, I would find the
'active' model a good option. If we were to start over I'd say let's
do that.

But as it stands, implementing actually useful 'active' management of
the database feels like a very large amount of work, one that will
take so long that by the time we complete it, it will not just be out
of date but will also limit us.

External but validate feels much more viable. What we really want is
that people can get reasonably good results without trying that hard
and great (but also various) results with a bit of effort.

So that means it ought to be possible to do enough OpenStack to
think it is cool with whatever database I happen to have handy. And
then once I dig it I should be able to manage it effectively using
the solutions that are best for my environment.

> Finally, this is focused on the database layer but similar questions arise in 
> other places. What is our philosophy on prescriptive/active choices on our 
> part coupled with automated action and ease of operation vs. expanded choices 
> for the deployer at the expense of configuration and operational complexity. 
> For now let's see if we can answer it for databases, and see where that gets 
> us.

I continue to think that this issue is somewhat special at the
persistence layer because of the balance of who it impacts the most:
the deployers, developers, and distributors more than the users[2].
Making global conclusions about external and active based on this
issue may be premature.

> Thanks for reading.

Thanks for writing. You've done a lot of writing lately. Is good.

[1] http://lists.openstack.org/pipermail/openstack-operators/2017-May/013464.html

[2] That our database choices impact the users (e.g., the case and encoding
things at the API layer) is simply a mistake that we all made together, a
bug to be fixed, not an architectural artifact.
-- 
Chris Dent                  ┬──┬◡ノ(° -°ノ)       https://anticdent.org/
freenode: cdent                                         tw: @anticdent

