[Openstack-operators] Rolling upgrades and Neutron

Jesse Keating jlk at bluebox.net
Wed Mar 4 21:32:51 UTC 2015


On 3/4/15 12:56 PM, Assaf Muller wrote:
> Hello everyone,
>
> An issue came up recently:
> http://lists.openstack.org/pipermail/openstack-dev/2015-March/058280.html
>
> Where a recent Kilo patch made non-backwards-compatible changes to the RPC interface
> between the Neutron server and its agents. I'm trying to figure out how much
> of an issue that really is.
>
> The question is: Does anyone have any experience with performing a 'rolling upgrade'
> for Neutron, specifically, upgrading the Neutron API server(s) first, and upgrading
> Neutron agents later? Has anyone performed this from Icehouse to Juno successfully?
> Would this typically work across the board for other services as well?

When database migrations are involved, typically we shut down all 
producers/consumers of the database, then migrate the database, then 
bring up new code for producers/consumers.

This model works across all the services (except for Swift, because...
Swift).

When database migrations are /not/ at play then the general desire is to 
do a rolling upgrade, in order to have services down for as little time 
as possible. It's not just doing all the APIs at once and then the agents;
it's doing a subset of the APIs at a time, in batches, so that the API itself
is never 100% down. This works in Nova, where there is the concept of
upgrade_levels for the RPC message format, and there is a conductor service
that can be upgraded first and can translate the internals of
RPC messages for older services. The end scenario was that we could
upgrade conductors first in one swoop (since they are bus consumers and 
not API points), then roll through the APIs and other services, then 
finally roll through the computes. Once everything was updated we could 
bump the upgrade_levels for compute.
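
To make the upgrade_levels idea concrete, here is a rough sketch of how a
version cap lets new code keep talking to old services. This is not Nova's
actual code: the class, the method names, the "reboot" message and the version
numbers are all made up for illustration. In Nova itself the cap is set under
the [upgrade_levels] section of nova.conf.

```python
# Hypothetical sketch of an RPC version cap; names and versions are
# illustrative assumptions, not Nova's real implementation.

def parse_version(version):
    """Turn a 'major.minor' string into a comparable tuple of ints."""
    major, minor = version.split(".")
    return int(major), int(minor)


class CappedClient:
    """Builds RPC messages no newer than the configured version cap."""

    def __init__(self, version_cap):
        self.version_cap = parse_version(version_cap)

    def can_send_version(self, version):
        # Compatible only within the same major version, and only if the
        # wanted version does not exceed the cap.
        wanted = parse_version(version)
        return wanted[0] == self.version_cap[0] and wanted <= self.version_cap

    def build_reboot(self, instance_id, reboot_type):
        # Assume 'reboot_type' was added in 3.1; leave it out for peers
        # that only understand 3.0.
        if self.can_send_version("3.1"):
            return {"version": "3.1", "method": "reboot",
                    "args": {"instance_id": instance_id,
                             "reboot_type": reboot_type}}
        return {"version": "3.0", "method": "reboot",
                "args": {"instance_id": instance_id}}
```

While old computes are still around you would run with the cap at the old
version (CappedClient("3.0") here), and only raise it once everything has
been updated.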

Without this sort of structure for Neutron it'll be... difficult to do 
mixed versions of individual API nodes as well as mixed versions of 
agents and APIs.

Given that agents aren't API listeners, an upgrade strategy could be to 
update the agents all at once to new code that's backwards compatible 
with the old API nodes, then roll through the API nodes, or vice versa:
roll through the API nodes to get to new code that is backwards compatible
with old agents, then update all the agents.

Either way, it's preferable to do things in as small "atomic" chunks as
possible. In large clusters, with Nova, there is a 1:1 relationship
between nova-compute and hypervisors, so anything that has to be atomic 
across compute is painful. Slow. With Neutron, depending on the setup, 
there is a similar relationship, so being able to break those up into 
batches, or at least being able to treat them at a different time from 
the public APIs is desirable.
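
The batching itself is simple to sketch. The following is a minimal
illustration only, assuming a plain list of host names and a caller-supplied
upgrade function; both helper names are hypothetical and not part of any
OpenStack tool.

```python
# Hypothetical helpers for rolling through hosts in batches.

def rolling_batches(hosts, batch_size):
    """Yield successive slices of hosts, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(hosts), batch_size):
        yield hosts[start:start + batch_size]


def rolling_upgrade(hosts, batch_size, upgrade):
    """Upgrade one batch at a time so the rest keep serving.

    'upgrade' is any callable taking a host name. Returns the batches in
    the order they were processed.
    """
    done = []
    for batch in rolling_batches(hosts, batch_size):
        for host in batch:
            upgrade(host)
        done.append(batch)
    return done
```

With five API nodes and a batch size of two, this processes two nodes, then
two, then the last one, so at least three nodes are untouched at any moment.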



-- 
-jlk


