It didn't take that long to evaluate. Unfortunately, this approach
doesn't work for me. I tried only upgrading the neutron-server
package, but there are dependencies for the other neutron agents, so
they are upgraded as well. I could reduce the downtime of the
L3-agent, though.
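
One way to see in advance which packages a targeted upgrade would pull in is to run apt in simulation mode and list what it reports. The following is only a minimal sketch, assuming `apt-get -s install --only-upgrade` is available on the node and that simulated changes show up on lines starting with `Inst`; the function names are illustrative and not part of any tool mentioned in this thread.

```python
import subprocess


def parse_simulated_upgrades(apt_output: str) -> list[str]:
    """Return the package names from the 'Inst' lines of a simulated apt run."""
    packages = []
    for line in apt_output.splitlines():
        fields = line.split()
        # Simulation mode reports each package it would install or upgrade
        # as "Inst <package> ...".
        if len(fields) >= 2 and fields[0] == "Inst":
            packages.append(fields[1])
    return packages


def simulate_only_upgrade(package: str) -> list[str]:
    """Ask apt which packages would change; with -s nothing is installed."""
    result = subprocess.run(
        ["apt-get", "-s", "install", "--only-upgrade", package],
        capture_output=True, text=True, check=True,
    )
    return parse_simulated_upgrades(result.stdout)
```

Calling `simulate_only_upgrade("neutron-server")` on a control node should then show whether the other neutron agents would be upgraded along with it, before anything is actually changed.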
Since this doesn't come up often (neither upgrades in general nor db
schema changes in particular), we'll stick with our current upgrade
procedure.
But I'm still voting for adding db schema changes to the release notes.
Thanks again,
Eugen
Quoting Eugen Block <eblock@nde.ag>:
> Hi,
>
> thanks for sharing!
> I'll have to adapt my upgrade procedure and test it properly. This
> could take a while, though.
>
> Quoting Tobias Urdin - Binero IT <tobias.urdin@binero.com>:
>
>> Hello,
>>
>> In more detail, this is the procedure we’re using. We recently
>> upgraded twice, first from Zed to Antelope, then from Antelope to
>> Caracal.
>>
>> - Install new version of Neutron and run database expand
>>
>> - Upgrade neutron-server on all “controller” nodes
>>
>> - Run database contract
>>
>> - Upgrade OVS, L3, Metadata, DHCP agents on network nodes (on
>>   controller nodes in some people’s setups)
>>
>>   - First OVS and then wait for it to start correctly
>>
>>   - Stop DHCP, L3, Metadata (in that order)
>>
>>   - Upgrade agents and start in same order as above
>>
>> - Upgrade OVS agent on compute nodes
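
The ordering in the list above can be written down as a flat, printable plan. This is only a sketch: it does not execute anything, the service names follow Ubuntu packaging and are an assumption, the `neutron-db-manage upgrade --expand` and `--contract` invocations are the usual way to run the expand and contract phases rather than something quoted in this thread, and the function names are hypothetical.

```python
# Service names follow Ubuntu packaging and are an assumption; adjust as needed.
OVS_AGENT = "neutron-openvswitch-agent"
OTHER_AGENTS = ["neutron-dhcp-agent", "neutron-l3-agent", "neutron-metadata-agent"]


def network_node_steps() -> list[str]:
    """Agent steps for one network node, in the order given in the list above."""
    steps = [f"upgrade {OVS_AGENT}, wait until it has started correctly"]
    # Stop DHCP, L3, Metadata in that order.
    steps += [f"stop {agent}" for agent in OTHER_AGENTS]
    steps.append("upgrade agent packages")
    # Start them again in the same order.
    steps += [f"start {agent}" for agent in OTHER_AGENTS]
    return steps


def upgrade_plan(controllers: list[str], network_nodes: list[str],
                 computes: list[str]) -> list[str]:
    """Flatten the whole procedure into one ordered list of steps."""
    plan = ["install new Neutron, run: neutron-db-manage upgrade --expand"]
    plan += [f"{node}: upgrade neutron-server" for node in controllers]
    plan.append("run: neutron-db-manage upgrade --contract")
    for node in network_nodes:
        plan += [f"{node}: {step}" for step in network_node_steps()]
    plan += [f"{node}: upgrade {OVS_AGENT}" for node in computes]
    return plan
```

For example, `print("\n".join(upgrade_plan(["ctl1"], ["net1"], ["cmp1"])))` (host names made up) gives a checklist that can be reviewed before touching any node.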
>>
>> Happy to take feedback if the above can be improved.
>>
>> From what I remember, in all these years we’ve only had issues
>> with upgrades twice. One was a keepalived bug; the other was when
>> Neutron moved to primary/backup wording for L3 HA, which I think
>> may also have been caused by our double-jump upgrade making us miss
>> some translation patch somewhere, or similar.
>>
>> /Tobias
>>
>>> On 21 Mar 2025, at 15:44, Eugen Block <eblock@nde.ag> wrote:
>>>
>>> Thanks for your quick response, appreciate it!
>>> I've read that page as well, but that's been a while. I guess I
>>> didn't pay too much attention since the recent upgrades all went
>>> well. Until now, I just ran 'apt upgrade' on the first node, which
>>> would upgrade all packages, of course, then did an expand, and the
>>> contract command was issued on the last control node.
>>>
>>> So what would be the ideal way? First upgrade only neutron-server
>>> and l2 agents on all control nodes ('apt upgrade --only-upgrade
>>> <neutron-server|openvswitch-agent>'), then expand and contract,
>>> and then upgrade the rest of the packages?
>>>
>>>
>>> Quoting Tobias Urdin - Binero IT <tobias.urdin@binero.com>:
>>>
>>>> Hello,
>>>>
>>>> We upgrade in a very specific order, as mentioned in [1]: first
>>>> database expand, then all neutron-server applications are
>>>> upgraded, then contract, all before any agents.
>>>>
>>>> [1]
>>>> https://docs.openstack.org/neutron/latest/contributor/internals/upgrade.html
>>>>
>>>> /Tobias
>>>>
>>>>> On 21 Mar 2025, at 15:12, Eugen Block <eblock@nde.ag> wrote:
>>>>>
>>>>> Hi *,
>>>>>
>>>>> maybe I missed some announcement or something, but usually, I
>>>>> read the release notes [0] before upgrading our OpenStack cloud.
>>>>> I didn't notice anything regarding DB schema upgrades. And after
>>>>> the upgrade from Yoga to Zed in a test environment went well, I
>>>>> tried the same in our production today. Note that I didn't have
>>>>> a router in my test cloud, so that's probably why I didn't
>>>>> notice anything.
>>>>>
>>>>> Unfortunately, there has been a schema change, which is why the
>>>>> l3-agent failed to start properly with this error:
>>>>>
>>>>> 2025-03-21 12:29:14.527 846393 CRITICAL neutron [None
>>>>> req-e225ff0a-82e1-473b-9eba-9a11caa7ace7 - - - - - -] Unhandled
>>>>> error: oslo_messaging.rpc.client.RemoteError: Remote error:
>>>>> OperationalError (pymysql.err.OperationalError) (1054, "Unknown
>>>>> column 'portforwardings.external_port' in 'SELECT'")
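
When checking agent logs during a rolling upgrade, a schema mismatch like this one can be spotted by looking for MySQL error 1054 and pulling out the column it names. A minimal sketch, assuming the error text has the shape shown in the log line above; the function name is hypothetical.

```python
import re

# Matches MySQL error 1054 ("Unknown column") as it appears in the log line
# above; \s+ tolerates the line wrapping introduced by mail clients.
UNKNOWN_COLUMN = re.compile(r"""\(1054,\s*"Unknown\s+column\s+'([^']+)'""")


def missing_columns(log_text: str) -> list[str]:
    """Return the table.column names reported as unknown, without duplicates."""
    seen = []
    for name in UNKNOWN_COLUMN.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen
```

Applied to the error above, this returns `portforwardings.external_port`, which points at the code and the database schema being out of step with each other.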
>>>>>
>>>>> Indeed, the upgraded control node didn't have "external_port"
>>>>> anymore in
>>>>> /usr/lib/python3/dist-packages/neutron/db/models/port_forwarding.py,
>>>>> while the not yet upgraded control node did. So the situation
>>>>> could only be resolved by proceeding with the upgrade. But that
>>>>> meant an interruption for our virtual routers, causing floating
>>>>> IPs to be unreachable for a couple of minutes.
>>>>>
>>>>> Note that we're using highly-available routers. I thought about
>>>>> setting "no-ha" for each router, but that can only be done for
>>>>> disabled routers, which is not an option, of course. And it
>>>>> doesn't really fit into the "rolling upgrade" concept, which has
>>>>> worked great so far. Since we moved to Ubuntu last September
>>>>> (while still on Victoria), we've been able to upgrade to Yoga
>>>>> without any issues.
>>>>>
>>>>> And while the interruption today was not too critical, I was
>>>>> still surprised that such an important change didn't even make
>>>>> it into the Zed release notes. Was that a mistake or did I miss
>>>>> something? Are there other places I need to check before
>>>>> attempting an upgrade?
>>>>>
>>>>> Thanks,
>>>>> Eugen
>>>>>
>>>>> [0] https://docs.openstack.org/releasenotes/neutron/zed.html
>>>>>
>>>
>>>
>>>