[openstack-dev] [heat] Rolling Upgrades

Michał Dulko michal.dulko at intel.com
Fri Oct 21 12:37:46 UTC 2016


On 10/21/2016 02:02 AM, Crag Wolfe wrote:
> At Summit, folks will be discussing the rolling upgrade issue across a
> couple of sessions. I personally won't be able to attend, but thought
> I would share my thoughts on the subject.
>
> To handle rolling upgrades, there are two general cases to consider:
> database model changes and RPC method signature changes.
>
> For DB Model changes (this has already been well discussed on the
> mailing list, see the footnotes), let's assume for the moment we don't
> want to use triggers. If we are moving data from one column/table to
> another, the pattern looks like:
>
> legacy release: write to old location
> release+1: write to old and new location, read from old
> release+2: write to old and new location, read from new,
>            provide migration utility
> release+3: write to new location, read from new
>
> Works great! The main issue is if the duplicated old and new data
> happens to be large. For a heat-specific example (one that is close to
> my heart), consider moving resource/event properties data into a
> separate table.
>
> We could speed up the process by adding config variables that specify
> where to read from, but that is putting a burden on the operator,
> creating a risk that data is lost if the config variables are not
> updated in the correct order after each full rolling restart, etc.
>
> Which brings us back to triggers. AFAIK, only sqlalchemy+mariadb is
> being used in production, so we really only have one backend we would
> have to write triggers for. If the data duplication is too unpalatable
> for a given migration (using the +1, +2, +3 pattern above), we may
> have to wade into the less simple world of triggers.
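
To make the +1/+2/+3 pattern above concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical (the `Phase` and `Store` names, the dicts standing in for the old column and the new table). The fallback to the old location in release+2 is an assumption of the sketch, to cover rows the migration utility has not copied yet.

```python
from enum import Enum


class Phase(Enum):
    LEGACY = 0   # write old, read old
    PLUS_1 = 1   # write old and new, read old
    PLUS_2 = 2   # write old and new, read new (migration utility available)
    PLUS_3 = 3   # write new, read new


class Store:
    """Toy data access layer; `old` and `new` are shared dicts, not a real DB."""

    def __init__(self, phase, old, new):
        self.phase = phase
        self.old = old
        self.new = new

    def write(self, key, value):
        if self.phase is not Phase.PLUS_3:
            self.old[key] = value
        if self.phase is not Phase.LEGACY:
            self.new[key] = value

    def read(self, key):
        if self.phase in (Phase.LEGACY, Phase.PLUS_1):
            return self.old.get(key)
        if self.phase is Phase.PLUS_2 and key not in self.new:
            # Assumption of this sketch: fall back for unmigrated rows.
            return self.old.get(key)
        return self.new.get(key)

    def migrate(self):
        """Stand-in for the migration utility: copy rows missing from `new`."""
        copied = 0
        for key, value in self.old.items():
            if key not in self.new:
                self.new[key] = value
                copied += 1
        return copied
```

Note that both locations hold a full copy of the data from release+1 until release+3, which is exactly the duplication cost mentioned above.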

I just wanted to remind everyone that Heat has a unit test [1] that
blocks contracting DB migrations.

> For RPC changes, we don't have a great solution right now (looking
> specifically at heat/engine/service.py). If we add a field, an older
> running heat-engine will break if it receives a request from a newer
> running heat-engine. For a relevant example, consider adding the
> "root_id" as an argument (
> https://review.openstack.org/#/c/354621/13/heat/engine/service.py ).
>
> Looking for the simplest solution -- if we introduce a mandatory
> "future_args" arg (a dict) now to all rpc methods (perhaps provide a
> decorator to do so), then we could follow this pattern post-Ocata:
>
> legacy release: accepts the future_args param (but does nothing with it).
> release+1: accept the new parameter with a default of None,
>            pass the value of the new parameter in future_args.
> release+2: accept the new parameter, pass the value of the new parameter
>            in its proper placeholder, no longer in future_args.
>
> But, we don't have a way of deleting args. That's not super
> awful... old args never die, they just eventually get ignored. As for
> adding new api's, the pattern would be to add them in release+1, but
> not call them until release+2. [If we really have a case where we need
> to add and use a new api in release+1, the solution may be to have two
> rpc api messaging targets in release+1, one for the previous
> major.minor release and another for the major+1.0 release that has the
> new api. Then, of course, we could remove outdated args in
> major+1.0.]
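
For reference, the future_args pattern above could be sketched roughly like this. The decorator names, the engine classes and the `example_call` method are all made up for illustration; this is not code from heat/engine/service.py.

```python
import functools


def accepts_future_args(method):
    """Hypothetical 'legacy release' decorator: accept future_args, ignore it."""
    @functools.wraps(method)
    def wrapper(self, ctxt, *args, future_args=None, **kwargs):
        return method(self, ctxt, *args, **kwargs)
    return wrapper


def promotes_future_args(*names):
    """Hypothetical 'release+1' decorator: lift known keys out of future_args."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, ctxt, *args, future_args=None, **kwargs):
            for name in names:
                missing = kwargs.get(name) is None
                if missing and future_args and name in future_args:
                    kwargs[name] = future_args[name]
            return method(self, ctxt, *args, **kwargs)
        return wrapper
    return decorator


class LegacyEngine:
    @accepts_future_args
    def example_call(self, ctxt, stack_id):
        return {"stack_id": stack_id}


class PlusOneEngine:
    @promotes_future_args("root_id")
    def example_call(self, ctxt, stack_id, root_id=None):
        return {"stack_id": stack_id, "root_id": root_id}


# What a release+1 sender would put on the wire: the new value travels
# inside future_args, so a legacy receiver does not choke on it.
message = {"stack_id": "s1", "future_args": {"root_id": "r0"}}
legacy_result = LegacyEngine().example_call(None, **message)
plus_one_result = PlusOneEngine().example_call(None, **message)
```

The legacy engine drops `root_id` silently, while the release+1 engine picks it up from `future_args`. In release+2 the sender would pass `root_id` directly and the same release+1 style receiver would still work.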

Another solution is to adopt Nova's and Cinder's approach. You need some
kind of RPC version reporting and detection framework. In Cinder the
version is reported in the `services` table [2], and the supported RPC
API version is detected [3] based on that data. Requests are then
backported to the required version at the RPC client level (e.g. [4]).

> Finally, a note about Oslo versioned objects: they don't really help
> us. They work great for nova where there is just nova-conductor
> reading and writing to the DB, but we have multiple heat-engines doing
> that, and they need to be restarted in a rolling manner. See the references
> below for greater detail.

They do help when you're changing the *content* of RPC arguments. In
particular, they make it easier to modify the schema of dict-like
structures sent over RPC.
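
As an illustration of what I mean, here is a plain-Python imitation of the backporting that versioned objects give you (in oslo.versionedobjects the corresponding hook is `obj_make_compatible()`). The function name, the payload layout and the version history are hypothetical and only meant to show the shape of the idea.

```python
def stack_to_primitive(stack, target_version):
    """Serialize a dict-like payload for a peer that speaks target_version.

    Made-up history for this sketch: version 1.1 added the 'root_id' field.
    """
    primitive = {"version": "1.1", "data": dict(stack)}
    target = tuple(int(part) for part in target_version.split("."))
    if target < (1, 1):
        # Older peers do not know 'root_id', so strip it before sending.
        primitive["data"].pop("root_id", None)
        primitive["version"] = "1.0"
    return primitive


stack = {"id": "s1", "root_id": "r0"}
for_old_engine = stack_to_primitive(stack, "1.0")
for_new_engine = stack_to_primitive(stack, "1.1")
```

The RPC method signature stays the same in both cases; only the schema of the payload is adjusted for the receiver.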

> --Crag
>
> References
> ----------
>
> [openstack-dev] [Heat] Versioned objects upgrade patterns
> http://lists.openstack.org/pipermail/openstack-dev/2016-May/thread.html#95245
>
> [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades:
> database triggers and oslo.versionedobjects
> http://lists.openstack.org/pipermail/openstack-dev/2016-September/102698.html
> http://lists.openstack.org/pipermail/openstack-dev/2016-October/105764.html

[1] https://github.com/openstack/heat/blob/master/heat/tests/db/test_migrations.py#L114-L137
[2] https://github.com/openstack/cinder/blob/325f99a64aeb3e7a768904781d854c19bb540580/cinder/db/sqlalchemy/models.py#L86-L89
[3] https://github.com/openstack/cinder/blob/8a4aecb155478e9493f4d36b080ccdf6be406eba/cinder/rpc.py#L208-L224
[4] https://github.com/openstack/cinder/blob/7a2adc08c75414a25dc22bdc74790b68bd749c45/cinder/volume/rpcapi.py#L305-L307



