[Openstack-operators] Rolling upgrades and Neutron
Jesse Keating
jlk at bluebox.net
Wed Mar 4 21:32:51 UTC 2015
On 3/4/15 12:56 PM, Assaf Muller wrote:
> Hello everyone,
>
> An issue came up recently:
> http://lists.openstack.org/pipermail/openstack-dev/2015-March/058280.html
>
> Where a recent Kilo patch made a non-backwards-compatible change to the RPC interface
> between the Neutron server and its agents. I'm trying to figure out how much
> of an issue that really is.
>
> The question is: Does anyone have any experience with performing a 'rolling upgrade'
> for Neutron, specifically, upgrading the Neutron API server(s) first, and upgrading
> Neutron agents later? Has anyone performed this from Icehouse to Juno successfully?
> Would this typically work across the board for other services as well?
When database migrations are involved, typically we shut down all
producers/consumers of the database, then migrate the database, then
bring up new code for producers/consumers.
This model works across all the services (except for swift, because...
swift).
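As a rough illustration of that stop / migrate / start ordering, here is a
minimal Python sketch. It only builds an ordered list of steps and executes
nothing; the service names and the "migrate-db" label are hypothetical
placeholders, not real commands.

```python
def offline_upgrade_plan(services, migrate_step="migrate-db"):
    """Return ordered (action, target) steps for an upgrade with a DB migration.

    Every name here is a hypothetical placeholder; nothing is executed.
    """
    # 1. Stop every producer/consumer of the database.
    plan = [("stop", svc) for svc in services]
    # 2. Migrate the schema while nothing is using it.
    plan.append(("run", migrate_step))
    # 3. Bring the same services back up on the new code.
    plan.extend(("start-new", svc) for svc in services)
    return plan


for action, target in offline_upgrade_plan(["api", "scheduler", "agent"]):
    print(action, target)
```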
When database migrations are /not/ at play then the general desire is to
do a rolling upgrade, in order to have services down for as little time
as possible. It's not just doing all the APIs at once and then agents,
it's doing a sub-set of APIs in a batch mode, so that the API itself is
never 100% down. This works in Nova, where there is a concept of
upgrade_levels for the RPC message format, and there is a conductor service
that can be upgraded first and can translate the internals of
RPC messages for older services. The end scenario was that we could
upgrade conductors first in one swoop (since they are bus consumers and
not API points), then roll through the APIs and other services, then
finally roll through the computes. Once everything was updated we could
bump the upgrade_levels for compute.
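To make the version pinning idea concrete, here is a toy Python model. It is
not Nova's implementation; the function names and the version strings are
assumptions made up for illustration. The point is only that a sender never
emits a message format newer than the pin, and the pin is lifted once every
service understands the newer format.

```python
def parse_version(text):
    """Turn '3.23' into (3, 23) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def version_to_send(newest_known, cap=None):
    """Pick the RPC message version a sender should use.

    While a cap (pin) is set, never send anything newer than the cap,
    so services still running old code can understand the message.
    """
    if cap is None:
        return newest_known
    return min(newest_known, cap, key=parse_version)


def can_lift_cap(understood_versions, target):
    """True once every host understands at least the target version.

    understood_versions maps a (hypothetical) host name to the newest
    RPC version that host can handle.
    """
    wanted = parse_version(target)
    return all(parse_version(v) >= wanted for v in understood_versions.values())


hosts = {"compute-1": "3.40", "compute-2": "3.23"}  # compute-2 not upgraded yet
print(version_to_send("3.40", cap="3.23"))  # 3.23
print(can_lift_cap(hosts, "3.40"))          # False
```

Once the last host is upgraded, can_lift_cap returns True and the cap can be
removed, which corresponds to bumping the level at the end of the roll.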
Without this sort of structure for Neutron it'll be... difficult to do
mixed versions of individual API nodes as well as mixed versions of
agents and APIs.
Given that agents aren't API listeners, an upgrade strategy could be to
update the agents all at once to new code that's backwards compatible
with the old API nodes then roll through the API nodes, or vice versa.
Roll through API nodes to get to new code that is backwards compatible
with old agents, then update all the agents.
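The choice between those two orderings comes down to which side's new code
tolerates the other side's old code. A small Python sketch of that decision,
with hypothetical flag names, and with the "neither direction is compatible"
case being my own assumption about what follows:

```python
def choose_order(new_agents_ok_with_old_api, new_api_ok_with_old_agents):
    """Pick an upgrade order based on which new code is backwards compatible.

    Returns a list of steps, or None when no mixed-version window is safe
    (assumption: that case needs a coordinated stop instead of a roll).
    """
    if new_agents_ok_with_old_api:
        # New agents can talk to old API nodes: agents first, then roll APIs.
        return ["update all agents", "roll through API nodes"]
    if new_api_ok_with_old_agents:
        # New API nodes can talk to old agents: roll APIs, then agents.
        return ["roll through API nodes", "update all agents"]
    return None


print(choose_order(False, True))
```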
Either way it's preferable to do things in "atomic" chunks that are as
small as possible. In large clusters, with Nova, there is a 1:1 relationship
between nova-compute and hypervisors, so anything that has to be atomic
across compute is painful. Slow. With Neutron, depending on the setup,
there is a similar relationship, so being able to break those up into
batches, or at least being able to treat them at a different time from
the public APIs is desirable.
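Breaking a large set of hosts into batches is simple to sketch. The helper
below is a generic illustration with made-up host names, not part of any
OpenStack tooling:

```python
def batches(hosts, size):
    """Split hosts into consecutive batches of at most `size` hosts."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    return [hosts[i:i + size] for i in range(0, len(hosts), size)]


# Hypothetical agent hosts, upgraded two at a time.
agent_hosts = ["net-1", "net-2", "net-3", "net-4", "net-5"]
for batch in batches(agent_hosts, 2):
    print("upgrade", batch)
```

Each batch can then be upgraded and verified before moving on, so a failure
affects only a small slice of the cluster.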
--
-jlk