[openstack-dev] Online Migrations.

Andrew Laski andrew at lascii.com
Mon Jun 15 20:21:00 UTC 2015

On 06/15/15 at 03:23pm, Mike Bayer wrote:
>On 6/15/15 2:21 PM, Dan Smith wrote:
>>Tying this to the releases is less desirable from my perspective. It
>>means that landing a thing requires more than six months of developer
>>and reviewer context. We have that right now, and we get along, but it's
>>much harder to plan, execute, and clean up those sorts of longer-lived
>>changes. It also means that CDers have to wait for the contract to be
>>landed well after they should have been able to clean up their database,
>>and may imply that people _have_ to do a contract at some point,
>>depending on how it's exposed.
>>The goal for this was to separate the three phases. Tying one of them to
>>the releases kinda hampers the utility of it to some degree, IMHO.
>>Making it declarative (even when part of what is declared are the
>>condition(s) upon which a particular contraction can proceed) is much
>>more desirable to me.
>all of these things are true.
>but i don't see how this part of things is going to be solved unless 
>you otherwise do something like #1, but maybe not as complicated as 
>Here's the deal.  If I write a program, that says this:
>class MyThing(Model):
>    __tablename__ = 'thing'
>    x = Column()
>    y = Column()
>then I say:
>print session.query(MyThing)
>it's going to run "SELECT x, y FROM thing"
>if you want MyThing to have "y" there, but the program runs in some 
>kind of mode that doesn't include "y" anymore, you can do something 
>like this:
>class MyThing(Model):
>    __tablename__ = 'thing'
>    x = Column()
>    if we_have_column('thing', 'y'):
>        y = Column()
>note that the above is totally pseudocode.    If you want it to be 
>like "y = RemovedColumn()", there is probably a way to make it work 
>that way also, e.g. that there's this declared "y = something()" in 
>your model, but the MyThing model does not actually get a "y" in it, 
>and even that "y" is written to some other collection like 
>MyThing.columns_we_have_removed (again, also pseudocode).
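
A minimal runnable sketch of the conditional-column pseudocode quoted above, assuming SQLAlchemy 1.4 or later. The PRESENT dict and the body of we_have_column() are hypothetical stand-ins for whatever mechanism would actually know the column is gone, and the id primary key is added only because a declarative model needs one.

```python
from sqlalchemy import Column, Integer
from sqlalchemy.orm import declarative_base

# Hypothetical stand-in: some startup step would populate this.
PRESENT = {("thing", "y"): False}


def we_have_column(table, column):
    """Assumed helper: report whether a column should be mapped."""
    return PRESENT.get((table, column), True)


Base = declarative_base()


class MyThing(Base):
    __tablename__ = "thing"
    id = Column(Integer, primary_key=True)  # not in the original pseudocode
    x = Column(Integer)
    # The decision is taken once, when the class body is executed.
    if we_have_column("thing", "y"):
        y = Column(Integer)


mapped_columns = MyThing.__table__.c.keys()
```

Because the check runs at class-definition time, the answer has to be known before the models module is imported.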
>Alternatively, you can have MyThing with .x and .y and then try to 
>mess around with your Query() objects so that they skip "y" when this 
>condition occurs, which at the basic level looks like:
>With this approach, you'd probably want to use a new API I've added 
>in 1.0 that allows for on-query-construction events which can add 
>these deferral rules.   Hacking this into model_query() is going to 
>be more difficult / hardcoded and also isn't going to accommodate 
>things like lazy loads, joins, eager loads, etc.   In any case, to do 
>this correctly for intercepted queries is doable but might be 
>difficult and error prone in some cases, as it has to search for all 
>entities in the query, aliased, joined, subqueried, etc. that might 
>be referring to "thing.y".   Also something has to be worked out for 
>the persistence side; it needs to be excluded from INSERT statements 
>and even UPDATE statements if some logic is setting a value for it.     
>Or you could build up some SQL execution events using the SQLAlchemy 
>event API to just scrub these columns out when the SQL is emitted, 
>but then we have to parse and rewrite SQL.
>But either way, you can have all of that.   But what is not clear 
>here is, when is that decision made, that we no longer have "y" ?
>Is it made:
>1. at runtime?  e.g. your nova service is running, it's doing "SELECT 
>x, y FROM thing", then some magic thing happens somewhere and the app 
>suddenly sees, hey "y" is gone!  change all queries to "SELECT x FROM 
>thing".     What would this magic thing be?   Are you going to run a 
>reflection of the table schema on every query?  (You definitely 
>aren't.)   So I don't know that this is possible.

Would it be dangerous to signal that 'y' is gone by having a query fail 
and at that point the model could be updated?  In other words, is there 
a chance of a query failing in such a way as to leave data in an 
inconsistent or undesirable state?
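
A rough sketch of the failure-as-signal idea for the read path only. It uses the standard-library sqlite3 module rather than SQLAlchemy so it is self-contained. ThingReader is a hypothetical name, and matching on the "no such column" message is SQLite-specific. It does not address writes that fail partway through a larger transaction, which is the case the question above is really about.

```python
import sqlite3


class ThingReader:
    """Hypothetical reader that stops selecting columns the DB rejects."""

    PREFIX = "no such column: "

    def __init__(self, conn, columns):
        self.conn = conn
        self.columns = list(columns)

    def fetch_all(self):
        while True:
            sql = "SELECT %s FROM thing" % ", ".join(self.columns)
            try:
                return self.conn.execute(sql).fetchall()
            except sqlite3.OperationalError as exc:
                missing = self._missing_column(str(exc))
                # Re-raise anything that is not a known, droppable column.
                if missing is None or len(self.columns) == 1:
                    raise
                self.columns.remove(missing)

    def _missing_column(self, message):
        if message.startswith(self.PREFIX):
            name = message[len(self.PREFIX):]
            if name in self.columns:
                return name
        return None


# Simulate a database where "y" has already been contracted away.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thing (x INTEGER)")
conn.execute("INSERT INTO thing (x) VALUES (1)")

reader = ThingReader(conn, ["x", "y"])
rows = reader.fetch_all()
```

The failed SELECT changes no data here; the reader just retries with a shorter column list and remembers it for later calls.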

>2. at application start time?   e.g. nova service starts up, 
>something happens before "MyThing" is first declared where MyThing 
>knows that "y" is no longer there for this run (or something that 
>will impact all the queries and persistence operations, less 
>#2 is much more possible.  But still, how does it run?   How do we 
>know that "y" is there on one run, and is not there on another?   do 
>2a.  When the app starts up, we run reflection queries against the DB 
>(e.g. what autogenerate  / OSM does, looking in schema catalogs).    
>This is doable, but can get expensive on startup if we really have 
>lots of columns/tables to worry about.   It also means the changes to 
>the queries either happen entirely at query time (intricate, 
>difficult-ish) or at model definition time (simple, easy).   The 
>latter requires the app to be connected to the database before it 
>imports the models, which is the complete opposite of how Nova's 
>api.py is constructed right now.   Plus the feature needs to 
>accommodate Cells, where there's a totally different database in play 
>(maybe this has to be query time for that reason alone).
>2b. In a config file somewhere?   Some kind of directive that says, 
>"hey, we have now dropped thing.y"?  What would that look like?
>2c. Based on some kind of version number in the database?   Not too 
>much different from #2a.
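
As an illustration of option 2a, here is a small sketch of a one-time startup check that compares declared columns against what the database reports. It uses the standard-library sqlite3 module and SQLite's PRAGMA table_info instead of SQLAlchemy reflection, purely to stay self-contained. DECLARED_COLUMNS, existing_columns(), resolve_columns() and select_sql() are hypothetical names, not anything in Nova.

```python
import sqlite3

# Hypothetical declaration of what the code believes each table contains.
DECLARED_COLUMNS = {"thing": ["x", "y"]}


def existing_columns(conn, table):
    """Return the column names SQLite reports for the given table."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return {row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)}


def resolve_columns(conn, declared):
    """Run once at startup: keep only declared columns that still exist."""
    resolved = {}
    for table, columns in declared.items():
        present = existing_columns(conn, table)
        resolved[table] = [c for c in columns if c in present]
    return resolved


def select_sql(table, resolved):
    """Build a SELECT that names only the columns found at startup."""
    return "SELECT %s FROM %s" % (", ".join(resolved[table]), table)


# Simulate a database where "y" has already been dropped.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thing (x INTEGER)")
conn.execute("INSERT INTO thing (x) VALUES (1)")

resolved = resolve_columns(conn, DECLARED_COLUMNS)
sql = select_sql("thing", resolved)
rows = conn.execute(sql).fetchall()
```

The cost is one catalog query per table at startup, and the result is only valid for the database the connection points at, which is where the Cells concern in 2a comes in.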
>>That said, I still think we should get the original thing merged. Even
>>if we did contractions purely with the manual migrations for the
>>foreseeable future, that'd be something we could deal with.
>>OpenStack Development Mailing List (not for usage questions)
>>Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
