[openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

Henry Nash henrynash9 at mac.com
Thu Sep 1 12:29:19 UTC 2016


So, as the person who drove the rolling upgrade requirement into keystone this cycle (because we have real customers that need it), who first wrote the keystone upgrade process to be “versioned object ready” (because I assumed we would do this the same way as everyone else) and subsequently re-wrote it to be “DB trigger ready”…and who has written migration scripts for both of these cases for the (in fact very minor) DB changes that keystone has in Newton…I guess I should also weigh in here :-)

For me, the argument comes down to:

a) Is the pain that needs to be cured by the rolling upgrade requirement broadly in the same place in the various projects (i.e. nova, glance, keystone etc.)? If it is, then working towards a common solution is always preferable (whatever that solution is).
b) I would characterise the difference between the trigger approach, the versioned objects approach and the “in-app” approach as: do we want a small amount of very nasty complexity, or do we spread that complexity out so that it is not as bad in any one place but covers a broader area? Probably fewer people can (successfully) write the nasty trigger complexity than can write, say, the “do it all in the app” code (a rough sketch of the in-app style follows this list). LOC (which, of course, isn’t always a good measure) also reflects this characterisation, with the trigger code probably having the fewest LOC and the app code the greatest.
c) I don’t really follow the argument that trigger code in migrations is somehow less desirable because we use higher-level sqla abstractions in our main-line code - I’ve always seen migrations as different and expected that we might have to do strange things there. Further, we should be aware of the time-periods involved…the migration cycle is a small percentage of the elapsed time the cloud is running (well, hopefully) - so again, do we solve the “issues of migration” as part of the migration cycle (which is what the trigger approach does), or do we make our code (effectively) continually migration-aware (using versioned objects or in-app code)?
d) The actual process (for an operator) of a rolling upgrade is simpler with triggers than with the alternatives, since you don’t require several of the checkpoints (e.g. knowing when you can move out of compatibility mode). Operator error is also a cause of problems in upgrades (especially as the complexity of a cloud increases).
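To make the contrast in (b) concrete, here is a minimal sketch of what the “do it all in the app” style tends to look like: while any old-release node may still be reading the legacy column, new code writes both representations until the operator switches compatibility mode off. The table and column names (user, password, password_hash) are made up for the illustration and are not keystone’s actual schema:

    # Hypothetical "in-app" compatibility sketch - names are illustrative only.
    import sqlalchemy as sa

    metadata = sa.MetaData()
    user = sa.Table(
        'user', metadata,
        sa.Column('id', sa.String(64), primary_key=True),
        sa.Column('password', sa.String(128)),        # legacy column
        sa.Column('password_hash', sa.String(128)),   # its replacement
    )

    def save_user(conn, user_id, password_hash, compat_mode=True):
        """Write the new column, and keep the legacy column in sync until
        the operator signals that compatibility mode can be switched off."""
        values = {'id': user_id, 'password_hash': password_hash}
        if compat_mode:
            values['password'] = password_hash
        conn.execute(user.insert().values(**values))

The point is not that this is hard in any one place - it isn’t - but that every code path touching the data has to carry this awareness for the whole compatibility window.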

From a purely keystone perspective, my gut feeling is that the trigger approach is actually likely to lead to a more robust solution, not a less robust one - because we solve the very specific problems of a given migration (e.g. the need to keep column A in sync with column B) for a short period of time, right at the point of pain, with well established techniques - albeit complex ones that need coders experienced in those techniques. I actually prefer this small locality of complexity (marked with “there be dragons here, be careful”), as opposed to spreading medium pain over a large area, which by definition is updated by many…and may do the wrong thing inadvertently. It is also simpler for operators.
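As an illustration of “solve it right at the point of pain”, here is a minimal sketch of the trigger style as it might appear in a migration script. Again the table, column and trigger names are made up for the example (this is not one of the actual Newton migrations); the shape is a raw-SQL trigger installed only for the duration of the upgrade:

    # Hypothetical trigger-style migration sketch - names are illustrative only.
    import sqlalchemy as sa

    ADD_COLUMN = "ALTER TABLE user ADD COLUMN password_hash VARCHAR(128)"

    # Keep the new column populated for rows written by not-yet-upgraded nodes.
    MYSQL_TRIGGER = """
    CREATE TRIGGER user_password_sync BEFORE INSERT ON user
    FOR EACH ROW
        SET NEW.password_hash = COALESCE(NEW.password_hash, NEW.password)
    """

    def upgrade(migrate_engine):
        with migrate_engine.begin() as conn:
            conn.execute(sa.text(ADD_COLUMN))
            if migrate_engine.name == 'mysql':
                conn.execute(sa.text(MYSQL_TRIGGER))

In practice you also need the matching UPDATE trigger, a backfill of existing rows, and a later migration that drops the trigger and the legacy column - which is exactly where the “dragons” live - but all of it is confined to the migration scripts rather than the main-line code.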

I do recognise, however, that “let’s not do different stuff for a core project like keystone” is a powerful argument. I just don’t know how to square this with the fact that, although I started in the “versioned objects camp”, having worked through many of the issues I have come to believe that the trigger approach will be more reliable overall for this specific use case. From the other reactions to this thread, I don’t detect a lot of support for the trigger approach becoming our overall, cross-project solution.

The actual migrations keystone needs for Newton are minor, so one possibility is that we use keystone as a guinea pig for this approach in Newton…if we had to undo it in a subsequent release, we would not be talking about rafts of migration code to redo.

Henry



> On 1 Sep 2016, at 09:45, Robert Collins <robertc at robertcollins.net> wrote:
> 
> On 31 August 2016 at 01:57, Clint Byrum <clint at fewbar.com> wrote:
>> 
>> 
>> It's simple, these are the holy SQL schema commandments:
>> 
>> Don't delete columns, ignore them.
>> Don't change columns, create new ones.
>> When you create a column, give it a default that makes sense.
> 
> I'm sure you're aware of this but I think it's worth clarifying for non-
> DBAish folk: non-NULL values can change a DDL statement's execution
> time from O(1) to O(N) depending on the DB in use. E.g. for Postgres
> DDL requires an exclusive table lock, and adding a column with any
> non-NULL value (including constants) requires calculating a new value
> for every row, vs just updating the metadata - see
> https://www.postgresql.org/docs/9.5/static/sql-altertable.html
> """
> When a column is added with ADD COLUMN, all existing rows in the table
> are initialized with the column's default value (NULL if no DEFAULT
> clause is specified). If there is no DEFAULT clause, this is merely a
> metadata change and does not require any immediate update of the
> table's data; the added NULL values are supplied on readout, instead.
> """
> 
>> Do not add new foreign key constraints.
> 
> What's the reason for this - if it's to avoid exclusive locks, I'd
> note that the other rules above don't avoid exclusive locks - again,
> DB specific, and for better or worse we are now testing on multiple DB
> engines via 3rd party testing.
> 
> https://dev.launchpad.net/Database/LivePatching has some info from our
> experience doing online and very fast offline patches in Launchpad.
> 
> -Rob
> 



