[openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

Clint Byrum clint at fewbar.com
Tue Aug 30 13:57:57 UTC 2016


Excerpts from Mike Bayer's message of 2016-08-26 11:50:24 -0400:
> 
> On 08/25/2016 01:13 PM, Steve Martinelli wrote:
> > The keystone team is pursuing a trigger-based approach to support
> > rolling, zero-downtime upgrades. The proposed operator experience is
> > documented here:
> >
> >   http://docs.openstack.org/developer/keystone/upgrading.html
> >
> > This differs from Nova and Neutron's approaches to solve for rolling
> > upgrades (which use oslo.versionedobjects), however Keystone is one of
> > the few services that doesn't need to manage communication between
> > multiple releases of multiple service components talking over the
> > message bus (which is the original use case for oslo.versionedobjects,
> > and for which it is aptly suited). Keystone simply scales horizontally
> > and every node talks directly to the database.
> 
> 
> Hi Steve -
> 
> I'm a strong proponent of looking into the use of triggers to smooth 
> upgrades between database versions.   Even for projects using 
> versioned objects, the SQL layer still has to include functionality 
> for both versions of a particular schema change, which is itself 
> awkward.   I'm also still a little worried that not every case of this 
> can be handled by orchestration at the API level, rather than as a 
> single SQL-layer method that integrates both versions of a schema change.
> 

Speaking as an operator, I'd rather have awkwardness happen in the safe,
warm world of development than in the cold, dirty, broken world of
operations.

Speaking as a former DBA: Triggers introduce emergent behaviors and
complicate scaling and debugging in hidden ways that can frustrate even
the most experienced DBA. We've discussed FKs before, and how they are
a 1:1 trade-off of integrity vs. performance, and thus deserve more
scrutiny than they're typically given. Well, IMO, triggers are a 1:10
trade-off between development complexity and debugging complexity.

Speaking as a developer: Every case can in fact be handled simply and
in code without the database's help if we're willing to accept a small
level of imperfection and redundancy.
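
For a renamed column, that looks roughly like this (a sketch only,
with a hypothetical "user" table and columns, not code from any real
project): dual-write during the transition, and a fallback on read.

    # Sketch: old code knows only "name"; new code writes both "name"
    # and its replacement "display_name" until every node is upgraded.
    import sqlalchemy as sa

    def save_display_name(conn, user_id, display_name):
        # Dual-write: keep the old column populated so old code still
        # works against the new schema.
        conn.execute(
            sa.text("UPDATE user SET name = :v, display_name = :v"
                    " WHERE id = :id"),
            {"v": display_name, "id": user_id})

    def load_display_name(conn, user_id):
        row = conn.execute(
            sa.text("SELECT name, display_name FROM user"
                    " WHERE id = :id"),
            {"id": user_id}).first()
        # Fallback read: tolerate rows written by old code that never
        # set the new column (its value is still the server default).
        return row.display_name or row.name

The redundancy is the dual-write; the imperfection is that rows written
by old code mid-transition only have the old column. Both are bounded
by the length of the upgrade window.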

> Using triggers would resolve the issue of SQL-specific application code 
> needing to refer to two versions of a schema at once, at least for those 
> areas where triggers and SPs can handle it.   In the "ideal", it means 
> all the Python code can just refer to one version of a schema, and nuts 
> and bolts embedded into database migrations would handle all the 
> movement between schema versions, including the phase between expand and 
> contract.   Not that I think the "ideal" is ever going to be realized 
> 100%, but in some, maybe many, places this can work.
> 

As someone else brought up, this is an unnecessarily bleak view of how database
migrations work.

It's simple; these are the holy SQL schema commandments:

Don't delete columns, ignore them.
Don't change columns, create new ones.
When you create a column, give it a default that makes sense.
Do not add new foreign key constraints.

Following these commandments, one can run schema changes at any time. A
new schema should be completely ignorable by older code, because its
existing columns keep working and no new requirements are introduced.
New code can deal with the defaulted new columns gracefully.

Of course, once one can be certain that all app code is updated, one can
drop old columns and tables, and add FK constraints (if you so desire;
I personally think they're a waste of precious DB resources, but that
is a much more religious argument and I accept that it's not part of
this debate).
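
Concretely, a commandment-following expand migration, and the much
later contract step, might look something like this (a sketch with
hypothetical table and column names, written against Alembic):

    # Expand phase: purely additive, completely ignorable by old code.
    from alembic import op
    import sqlalchemy as sa

    def upgrade():
        # A new column instead of changing "name"; the server default
        # means code that never mentions it keeps working.
        op.add_column(
            'user',
            sa.Column('display_name', sa.String(255),
                      nullable=False, server_default=''))

    # Contract phase, in a *later* migration, run only once all app
    # code has been upgraded:
    #
    #     def upgrade():
    #         op.drop_column('user', 'name')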

> So if Keystone wants to be involved in paving the way for working with 
> triggers, IMO this would benefit other projects in that they could 
> leverage this kind of functionality in those places where it makes sense.
> 
> "Zero-downtime database migrations" is an incredibly ambitious goal, 
> and I think it would be wrong to exclude any one particular technique 
> in pursuing it.  A real-world success story would likely integrate many 
> different techniques as they apply to specific scenarios, and triggers 
> and SPs are IMO a really major one which I believe can be supported.
> 

I don't think it's all that ambitious to think we can just use tried and
tested schema evolution techniques that work for everyone else.

> >
> > Database triggers are obviously a new challenge for developers to write,
> > honestly challenging to debug (being side effects), and are made even
> > more difficult by having to hand-write triggers for MySQL, PostgreSQL,
> > and SQLite independently (SQLAlchemy offers no assistance in this case),
> > as seen in this patch:
> 
> So I would also note that we've been working on the availability of 
> triggers and stored functions elsewhere; a very raw patch that is to be 
> largely rolled into oslo.db is here:
> 
> https://review.openstack.org/#/c/314054/
> 
> This patch makes use of an Alembic pattern called "replaceable object", 
> which is intended specifically as a means of versioning things like 
> triggers and stored procedures:
> 
> http://alembic.zzzcomputing.com/en/latest/cookbook.html#replaceable-objects
> 
> Within the above Neutron patch, one thing I want to move towards is that 
> things like triggers and SPs would only need to be specified once, in 
> the migration layer, and not within the model.   To achieve this, tests 
> that work against MySQL and Postgresql would need to ensure that the 
> test schema is built up using migrations, and not create_all.  This is 
> already the case in some places and not in others.  There is work 
> ongoing in oslo.db to provide a modernized fixture system that supports 
> enginefacade cleanly as well as allows for migrations to be used 
> efficiently (read: once per many tests) for all MySQL/Postgresql test 
> suites, at https://review.openstack.org/#/c/351411/ .
> 
> As far as SQLite, I have a simple opinion with SQLite which is that 
> migrations, triggers, and SPs should not be anywhere near a SQLite 
> database.   SQLite should be used strictly for simple model unit tests, 
> where the schema is created using create_all(), and that's it.   The test 
> fixture system accommodates this as well.
> 

Agreed on this. Spin up a DB server in the functional tests if you want
to test any actual data manipulation.
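
For anyone following along, the replaceable-object recipe Mike points
at boils down to keeping the trigger DDL in the migration layer and
versioning it there. A sketch, with a hypothetical MySQL trigger that
keeps an old and a new column in sync during a transition:

    from alembic import op

    # Hypothetical trigger: new code writes display_name, and the
    # trigger keeps the old "name" column populated for old code.
    SYNC_NAME_TRIGGER = """
    CREATE TRIGGER user_sync_name BEFORE UPDATE ON user
    FOR EACH ROW SET NEW.name = NEW.display_name
    """

    def upgrade():
        op.execute(SYNC_NAME_TRIGGER)

    def downgrade():
        op.execute("DROP TRIGGER user_sync_name")

Whether you love or hate triggers, note that this is exactly the kind
of side effect that's invisible when you're staring at the Python code
later.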

> >
> > Our primary concern at this point is how to effectively test the
> > triggers we write against our supported database systems, and their
> > various deployment variations. We might be able to easily drop SQLite
> > support (as it's only supported for our own test suite), but should we
> > expect variation in support and/or actual behavior of triggers across
> > the MySQLs, MariaDBs, Perconas, etc, of the world that would make it
> > necessary to test each of them independently? If you have operational
> > experience working with triggers at scale: are there landmines that we
> > need to be aware of? What is it going to take for us to say we support
> > *zero* downtime upgrades with confidence?
> 
> *zero* downtime is an extremely difficult goal.   I appreciate that 
> people are generally nervous about making more use of relational 
> database features in order to help with this; however, as long as the 
> goal includes an application that can communicate with a database whose 
> schema is literally in flux, this is already an exotic goal.    Triggers 
> and stored procedures are in fact very boring.
>

Even more boring is just testing upgraded schemas with old versions of
code.
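
In practice that's a functional test along these lines (a sketch with a
hypothetical database URL and table; a real suite would run it against
MySQL and PostgreSQL):

    import sqlalchemy as sa
    from alembic import command
    from alembic.config import Config

    def test_old_release_query_against_new_schema():
        # Migrate all the way to the newest schema...
        command.upgrade(Config("alembic.ini"), "head")
        engine = sa.create_engine(
            "mysql+pymysql://test:test@localhost/test")  # hypothetical
        with engine.connect() as conn:
            # ...then issue a query exactly as the previous release's
            # code would. Old code never heard of the new column.
            rows = conn.execute(
                sa.text("SELECT id, name FROM user")).fetchall()
        assert rows is not None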


