[openstack-dev] [heat] Rolling Upgrades
Crag Wolfe
cwolfe at redhat.com
Fri Oct 21 00:02:28 UTC 2016
At Summit, folks will be discussing the rolling upgrade issue across a
couple of sessions. I personally won't be able to attend, but thought
I would share my thoughts on the subject.
To handle rolling upgrades, there are two general cases to consider:
database model changes and RPC method signature changes.
For DB Model changes (this has already been well discussed on the
mailing list, see the footnotes), let's assume for the moment we don't
want to use triggers. If we are moving data from one column/table to
another, the pattern looks like:
legacy release: write to old location
release+1: write to old and new location, read from old
release+2: write to old and new location, read from new,
provide migration utility
release+3: write to new location, read from new
Works great! The main issue is if the duplicated old and new data
happens to be large. For a heat-specific example (one that is close to
my heart), consider moving resource/event properties data into a
separate table.
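To make the release+1 step concrete, a dual-write sketch might look
roughly like this -- table, column and function names are made up for
illustration, not Heat's actual schema or code:

    # Rough sketch of the release+1 dual-write step. Table, column and
    # function names are hypothetical, not Heat's actual schema or code.
    from sqlalchemy import Column, ForeignKey, Integer, Text
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()


    class Resource(Base):
        __tablename__ = 'resource'
        id = Column(Integer, primary_key=True)
        # old location: properties stored inline on the resource row
        properties_data = Column(Text)


    class ResourceProperties(Base):
        __tablename__ = 'resource_properties_data'
        id = Column(Integer, primary_key=True)
        resource_id = Column(Integer, ForeignKey('resource.id'))
        # new location: properties stored in their own table
        data = Column(Text)


    def store_properties(session, resource, props_json):
        # release+1: write to both the old and the new location
        resource.properties_data = props_json
        session.add(ResourceProperties(resource_id=resource.id,
                                       data=props_json))
        session.flush()


    def load_properties(resource):
        # release+1: still read from the old location; release+2 flips
        # this to the new table and ships a migration utility for
        # existing rows.
        return resource.properties_data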
We could speed up the process by adding config variables that specify
where to read from, but that puts a burden on the operator and creates
a risk of data loss if the config variables are not updated in the
correct order after each full rolling restart.
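Just for completeness, such a knob might look like the hypothetical
oslo.config option below (not something Heat provides today):

    # Hypothetical oslo.config option letting the operator choose the
    # read location; the danger is that it has to be flipped in the
    # right order after each full rolling restart.
    from oslo_config import cfg

    upgrade_opts = [
        cfg.StrOpt('properties_data_location',
                   default='old',
                   choices=['old', 'new'],
                   help='Where heat-engine reads resource properties '
                        'from.'),
    ]
    cfg.CONF.register_opts(upgrade_opts, group='upgrade')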
Which brings us back to triggers. AFAIK, only sqlalchemy+mariadb is
being used in production, so we really only have one backend we would
have to write triggers for. If the data duplication is too unpalatable
for a given migration (using the +1, +2, +3 pattern above), we may
have to wade into the less simple world of triggers.
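To make that concrete, a trigger-based variant of the same hypothetical
migration might look roughly like this (only the INSERT case is shown;
UPDATE and DELETE would need companion triggers):

    # Rough sketch: a MariaDB trigger that mirrors writes made by legacy
    # engines to the old location into the new table, avoiding the
    # Python-level dual write. Table and column names are made up, not
    # Heat's schema.
    from sqlalchemy import create_engine, text

    MIRROR_TRIGGER = text("""
    CREATE TRIGGER resource_properties_mirror
    AFTER INSERT ON resource
    FOR EACH ROW
      INSERT INTO resource_properties_data (resource_id, data)
      VALUES (NEW.id, NEW.properties_data)
    """)


    def install_mirror_trigger(db_url):
        # run once as part of the release+1 schema migration
        engine = create_engine(db_url)
        with engine.connect() as conn:
            conn.execute(MIRROR_TRIGGER)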
For RPC changes, we don't have a great solution right now (looking
specifically at heat/engine/service.py). If we add an argument to an RPC
method, an older running heat-engine will break if it receives a request
from a newer running heat-engine. For a relevant example, consider adding
"root_id" as an argument (
https://review.openstack.org/#/c/354621/13/heat/engine/service.py ).
Looking for the simplest solution -- if we introduce a mandatory
"future_args" arg (a dict) now to all RPC methods (perhaps providing a
decorator to do so; see the sketch after this list), then we could
follow this pattern post-Ocata:
legacy release: accepts the future_args param (but does nothing with it).
release+1: accept the new parameter with a default of None,
pass the value of the new parameter in future_args.
release+2: accept the new parameter, pass the value of the new parameter
in its proper placeholder, no longer in future_args.
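A minimal sketch of what the server side of that could look like, with
hypothetical names rather than Heat's actual heat/engine/service.py
code:

    # Minimal sketch of a decorator that makes an RPC server method
    # tolerate (and ignore) a future_args dict. Names are hypothetical.
    import functools


    def accepts_future_args(func):
        @functools.wraps(func)
        def wrapper(self, ctxt, *args, **kwargs):
            # legacy release: accept future_args but do nothing with
            # entries this engine does not yet understand
            kwargs.pop('future_args', None)
            return func(self, ctxt, *args, **kwargs)
        return wrapper


    class EngineService(object):

        @accepts_future_args
        def update_stack(self, ctxt, stack_identity, template, params):
            # release+1 callers would put root_id into future_args;
            # release+2 would promote it to a real keyword argument here.
            pass

The decorator keeps older engines from blowing up on arguments they have
never heard of, at the cost of every RPC method carrying the extra dict.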
But, we don't have a way of deleting args. That's not super
awful... old args never die, they just eventually get ignored. As for
adding new APIs, the pattern would be to add them in release+1, but
not call them until release+2. [If we really have a case where we need
to add and use a new API in release+1, the solution may be to have two
RPC messaging targets in release+1, one for the previous major.minor
release and another for the major+1.0 release that has the new API (a
rough sketch follows). Then, of course, we could remove outdated args
in major+1.0.]
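If we ever did need that two-target variant, one way to realize it with
oslo.messaging might be two versioned endpoints behind the same engine
server, roughly like this (versions, topic and method names are made up
for illustration):

    # Rough sketch: a release+1 engine exposing a target compatible with
    # the previous major.minor release alongside a major+1.0 target that
    # carries the new API. Versions, topic and method names are
    # illustrative only, not Heat's actual RPC API.
    import oslo_messaging as messaging
    from oslo_config import cfg


    class EngineAPI_1(object):
        # endpoint for callers still on the previous major.minor release
        target = messaging.Target(version='1.34')

        def update_stack(self, ctxt, stack_identity, template, params,
                         future_args=None):
            pass


    class EngineAPI_2(object):
        # major+1.0 endpoint: new method available, outdated args dropped
        target = messaging.Target(version='2.0')

        def update_stack(self, ctxt, stack_identity, template, params,
                         root_id=None):
            pass


    def start_rpc_server():
        transport = messaging.get_transport(cfg.CONF)
        target = messaging.Target(topic='engine', server='engine-1')
        server = messaging.get_rpc_server(
            transport, target, [EngineAPI_2(), EngineAPI_1()])
        server.start()
        return server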
Finally, a note about oslo.versionedobjects: it doesn't really help
us. It works great for nova, where only nova-conductor reads from and
writes to the DB, but we have multiple heat-engines doing that, and
they need to be restarted in a rolling manner. See the references
below for greater detail.
--Crag
References
----------
[openstack-dev] [Heat] Versioned objects upgrade patterns
http://lists.openstack.org/pipermail/openstack-dev/2016-May/thread.html#95245
[openstack-dev] [keystone][nova][neutron][all] Rolling upgrades:
database triggers and oslo.versionedobjects
http://lists.openstack.org/pipermail/openstack-dev/2016-September/102698.html
http://lists.openstack.org/pipermail/openstack-dev/2016-October/105764.html