[openstack-dev] [Heat] Versioned objects upgrade patterns

Crag Wolfe cwolfe at redhat.com
Tue May 17 19:40:27 UTC 2016


On 05/17/2016 10:34 AM, Michał Dulko wrote:
> On 05/17/2016 06:30 PM, Crag Wolfe wrote:
>> Hi all,
>>
>> I've read that versioned objects are favored for supporting different
>> versions between RPC services and to support rolling upgrades. I'm
>> looking to follow the pattern for Heat. Basically, it is the classic
>> problem where we want to migrate from writing to a column in one table
>> to having that column live in a different table. Looking at nova code,
>> the version for a given versioned object is a constant in the given
>> object/<the_object_name>.py file. To properly support rolling upgrades
>> where we have older and newer heat-engine processes running
>> simultaneously (thus avoiding downtime), we have to write to both the
>> old column and the new column. Once all processes have been upgraded,
>> we can upgrade again to only write to the new location (but still able
>> to read from the old location of course). Following the existing
>> pattern, this means the operator has to upgrade <the_object_name>.py
>> twice (though it may be possible to increment VERSION in
>> <the_object_name>.py only once, the first time).
>>
>> The drawback of the above is that it means cutting two releases (since
>> there are two different versions of the .py file). However, I wanted to
>> check whether anyone has gone with a different approach so that only
>> one release is required. One way to
>> do that would be by specifying a version (or some other flag) in
>> heat.conf. Then, only one <the_object_name>.py release would be
>> required -- the logic of whether to write to both the old and new
>> location (the intermediate step) versus just the new location (the
>> final step) would be in <the_object_name>.py, dictated by the config
>> value. The advantage of this approach is that there is only one .py
>> file released, though the operator would still have to make a config
>> change and restart heat processes a second time to move from the
>> intermediate step to the final step.
> 
> Nova has the pattern of being able to do all that in one release by
> exercising o.vo, but there are assumptions they are relying on (details
> [1]):
> 
>   * nova-compute accesses the DB through nova-conductor.
>   * nova-conductor gets upgraded atomically.
>   * nova-conductor is able to backport an object if nova-compute is
>     older and doesn't understand it.
> 
> Now if you want to have heat-engines running in different versions and
> all of them are freely accessing the DB, then that approach won't work
> as there's no one who can do a backport.
> 
> We've faced the same issue in Cinder and developed a way to do such
> modifications in three releases for columns that are writable and two
> releases for columns that are read-only. This is explained in spec [2]
> and devref [3]. And yes, it's a little painful.
> 
> If I understood everything correctly, your idea of a two-step upgrade
> will work only for read-only columns. Consider this situation:
> 
>  1. We have a deployment running h-eng (A and B) at version X.
>  2. We apply the X+1 migration moving column `foo` to `bar`.
>  3. We upgrade h-eng A to X+1. Now it writes to both `foo` and `bar`.
>  4. A updates `foo` and `bar`.
>  5. B updates `foo`. Now the correct value is in `foo` only.
>  6. A wants to read the value. But is the latest one in `foo` or `bar`?
>     There is no way to tell.
> 
> 
> I know the Keystone team is trying to solve that with some SQLAlchemy
> magic, but I don't think the design is agreed on yet. There was a
> presentation at the summit [4] that mentions it (and attempts to clarify
> the approaches taken by different projects).
> 
> Hopefully this helps a little.
> 
> Thanks,
> Michal (dulek on freenode)
> 
> [1] http://www.danplanet.com/blog/2015/10/07/upgrades-in-nova-database-migrations/
> 
> [2] http://specs.openstack.org/openstack/cinder-specs/specs/mitaka/online-schema-upgrades.html
> 
> [3] http://docs.openstack.org/developer/cinder/devref/rolling.upgrades.html#database-schema-and-data-migrations
> 
> [4] https://www.youtube.com/watch?v=ivcNI7EHyAY
> 

That helps a lot, thanks! You are right, it would have to be a 3-step
upgrade to avoid the issue you mentioned in step 6.
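
To make the 3-step idea concrete, here is a minimal sketch of one way the
`foo` -> `bar` move could be staged so that any two adjacent releases can
run side by side. The release labels, the read/write split per release,
and the helper names are all assumptions made for illustration; this is
not Heat or Cinder code, and a real implementation may stage it differently.

```python
# Hypothetical staging of the foo -> bar move; release labels and the
# read/write split are assumptions for illustration, not Heat or Cinder code.
WRITES = {
    "X": ("foo",),           # original code: only knows about foo
    "X+1": ("foo", "bar"),   # step 1: write both, still trust foo
    "X+2": ("foo", "bar"),   # step 2: write both, start trusting bar
    "X+3": ("bar",),         # step 3: stop writing foo
}
READS = {"X": "foo", "X+1": "foo", "X+2": "bar", "X+3": "bar"}


def write(row, release, value):
    """Store value in every column that an engine at `release` writes."""
    for column in WRITES[release]:
        row[column] = value


def read(row, release):
    """Return the value from the column an engine at `release` trusts."""
    return row[READS[release]]


def can_coexist(release_a, release_b):
    """True if each engine always sees the other's latest write."""
    for writer, reader in ((release_a, release_b), (release_b, release_a)):
        # Assume the schema migration already copied foo into bar.
        row = {"foo": "initial", "bar": "initial"}
        write(row, writer, "updated")
        if read(row, reader) != "updated":
            return False
    return True


releases = ["X", "X+1", "X+2", "X+3"]
# Every pair of neighbouring releases can share the database safely.
adjacent_ok = all(can_coexist(a, b) for a, b in zip(releases, releases[1:]))
# Skipping a step breaks: X writes only foo, X+2 reads only bar.
skip_ok = can_coexist("X", "X+2")
```

In this sketch `adjacent_ok` comes out true and `skip_ok` false, which is
the reason each step has to be fully rolled out before the next one starts.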

Another thing I am wondering about: if my particular object is not
exposed over RPC, is it worth making it a full-blown o.vo or not? I.e., I
can do the 3 steps over 3 releases just in the object's .py file -- what
additional value do I get from o.vo?

I'm also shying away from the idea of allowing config-driven
upgrades. The reason is this: suppose an operator updates a config, then
does a rolling restart to go from X to X+1, then again (and probably
again) as needed. Everything works great, run a victory lap. A few weeks
later, some Ansible or Puppet automation accidentally blows away the
config value saying that heat-engine should be running at the X+3 version
for my_object. Ouch. Probably unlikely, but more likely than, say,
accidentally deploying a .py file from three releases ago.
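
A rough sketch of that failure mode, using a plain dict as a stand-in for
a parsed heat.conf. The option name `my_object_upgrade_step`, its default
of 1, and the `Engine` class are all made up for illustration; none of
them exist in Heat.

```python
# Hypothetical upgrade steps: which columns each step writes and reads.
STEPS = {
    1: {"writes": ("foo", "bar"), "reads": "foo"},
    2: {"writes": ("foo", "bar"), "reads": "bar"},
    3: {"writes": ("bar",), "reads": "bar"},
}


class Engine:
    """Stand-in for a heat-engine whose behaviour is driven by config."""

    def __init__(self, conf):
        # Assumption: a missing option silently falls back to step 1.
        self.step = STEPS[conf.get("my_object_upgrade_step", 1)]

    def write(self, row, value):
        for column in self.step["writes"]:
            row[column] = value

    def read(self, row):
        return row[self.step["reads"]]


row = {"foo": "v0", "bar": "v0"}

# One engine is correctly configured for the final step.
final = Engine({"my_object_upgrade_step": 3})
# Another had its option wiped by automation and fell back to step 1.
reset = Engine({})

final.write(row, "v1")    # only bar is updated
stale = reset.read(row)   # reads foo, so it still sees the old value
```

Here `stale` ends up as "v0" even though "v1" was written, and nothing in
the deployed code changed to explain why.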


