[kolla-ansible][tooz][devstack] Upgrade path to etcd v3.4
Clark Boylan
cboylan at sapwetik.org
Thu Aug 10 22:09:50 UTC 2023
On Thu, Aug 10, 2023, at 6:38 AM, Jan Gutter wrote:
> Hi folks,
>
> etcd maintains two stable branches [1], at the moment these appear to
> be the v3.4 and v3.5 series.
>
> etcd does not support skip-version updates without data loss [2] out of
> the box; operators are encouraged to do online updates.
>
> I took a quick sample of some etcd versions:
> * devstack currently uses etcd v3.3.12 (at time of writing) [3]
> * kolla uses etcd v3.3.27 [4]
> * tripleo uses rdo etcd: it appears they're already on 3.4.14 [5]
> * upstream etcd is at 3.4.27 and 3.5.9 [6]
>
> Now, the bad news:
>
> The client-side endpoint prefix of etcd is changing from `v3alpha` to `v3`,
> with a `v3beta` step added to ensure maximum confusion [7]
>
> tooz, for example, currently defaults to `v3alpha` for the recommended
> etcd3gw backend [8], but it's configurable on the client side by
> passing an extra option in the backend url.
>
> So just updating etcd from 3.3 to 3.4 usually breaks things until a
> couple of orchestration updates are made.
>
> It seems providing a smooth upgrade path would require coordination
> between orchestration, services that depend on etcd as a backend, and
> a couple of middleware libraries.
>
> The lack of skip-version upgrade support means it's probably also
> better to do this sooner rather than later. 3.4 is already late in its
> release cycle and due to go out of maintenance soon.
>
> Options:
> * upgrade to 3.4 and use `v3` endpoint everywhere - fix forward?
If I understand correctly, you can upgrade from 3.3 to 3.4 in a safe rolling fashion, then switch clients to the new endpoint name/version, and then upgrade from 3.4 to 3.5 in a rolling fashion? This seems like a reasonable path forward, but it will take some effort.
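To make the ordering concrete, here is a rough sketch of which gateway prefix a client could stay on across each rolling step. The table reflects my reading of the gRPC gateway doc [7] and is an assumption to verify against the etcd builds actually deployed; `SERVED_PREFIXES` and `shared_prefixes` are hypothetical names used only for illustration.

```python
# Gateway prefixes served by each etcd minor release, as read from the
# gRPC gateway doc [7]. Assumption: verify against your own etcd builds.
SERVED_PREFIXES = {
    "3.2": {"v3alpha"},
    "3.3": {"v3alpha", "v3beta"},
    "3.4": {"v3beta", "v3"},
    "3.5": {"v3"},
}


def shared_prefixes(old, new):
    """Return the prefixes a client can use on both sides of an upgrade."""
    return sorted(SERVED_PREFIXES[old] & SERVED_PREFIXES[new])


for old, new in (("3.3", "3.4"), ("3.4", "3.5"), ("3.3", "3.5")):
    print(f"{old} -> {new}: {shared_prefixes(old, new)}")
```

If the table is right, a client has to be moved off `v3alpha` before (or together with) the 3.3 to 3.4 step, and 3.3 and 3.5 share no prefix at all.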
For Devstack you don't typically need to worry about upgrading the etcd DB. That said, I wonder if Grenade complicates things. Do we upgrade services that might only be compatible with etcd 3.4 (or 3.5) and that would break if the control plane continues to run etcd 3.3?
It seems we should also update tooz to default to the modern endpoint, so that an override is only needed when talking to old systems, rather than requiring an override for current etcd.
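For reference, the override in question is a query option on the tooz backend url. A minimal sketch of building such a url follows; it assumes the etcd3gw driver's option is named `api_version` (check the driver source at [8]), and `etcd3gw_url` plus the example address are hypothetical.

```python
from urllib.parse import urlencode


def etcd3gw_url(host, port=2379, api_version=None):
    """Build a tooz backend url for the etcd3gw driver.

    Assumption: the driver reads the gateway prefix from an
    ``api_version`` query option; confirm against the driver source [8].
    """
    base = f"etcd3+http://{host}:{port}"
    if api_version is None:
        # No override: the driver falls back to its built-in default.
        return base
    return f"{base}?{urlencode({'api_version': api_version})}"


url = etcd3gw_url("192.0.2.10", api_version="v3")
print(url)

# With tooz installed, the url would be handed to the coordinator, e.g.:
#   from tooz import coordination
#   coordinator = coordination.get_coordinator(url, b"member-1")
```

Changing the default in tooz would mean deployments on current etcd pass no option at all, and only those still talking to old etcd set one.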
> * try to detect endpoints in middleware (etcd3gw) somehow?
The only reason I would try to detect the valid endpoint is if we need to support old etcd against new cloud components (or vice versa?) in order to support upgrade paths that might upgrade openstack independently of the etcd database. Otherwise I would roll forward and try to avoid complicating tools like tooz (or the deployment orchestration that configures tooz).
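If detection did turn out to be necessary, it could be as small as trying the prefixes from newest to oldest. This is only a sketch of the idea, not anything etcd3gw does today: `detect_api_prefix` and the `probe` callable are hypothetical, and a real probe would presumably issue an HTTP request under the prefix and treat a "not found" response as "not served".

```python
def detect_api_prefix(probe, candidates=("v3", "v3beta", "v3alpha")):
    """Return the first gateway prefix for which probe(prefix) is true.

    Candidates are tried newest first so that a server offering two
    prefixes is used through the more modern one.
    """
    for prefix in candidates:
        if probe(prefix):
            return prefix
    raise RuntimeError("no known etcd gateway prefix responded")


def fake_etcd_33(prefix):
    # Stand-in for a real HTTP probe: pretends to be a server that only
    # serves the two older prefixes.
    return prefix in {"v3alpha", "v3beta"}


print(detect_api_prefix(fake_etcd_33))
```

Even this small amount of logic adds a network round trip and a failure mode at startup, which is part of why rolling forward looks simpler.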
> * ask ChatGPT how to farm goats
>
> Links (not permalinks!):
> [1]:
> https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/branch_management.md
> [2]: https://etcd.io/docs/v3.3/upgrades/upgrade_3_4/
> [3]:
> https://opendev.org/openstack/devstack/src/branch/master/stackrc#L723
> [4]:
> https://opendev.org/openstack/kolla/src/branch/master/docker/etcd/Dockerfile.j2#L15
> [5]:
> https://review.rdoproject.org/r/plugins/gitiles/rdoinfo/+/master/buildsys-tags/cloud9s-openstack-antelope-release.yml#154
> [6]: https://github.com/etcd-io/etcd/releases/
> [7]: https://etcd.io/docs/v3.5/dev-guide/api_grpc_gateway/
> [8]:
> https://opendev.org/openstack/tooz/src/branch/master/tooz/drivers/etcd3gw.py#L204