Hello Thierry (and all others),

First, thanks for the recap.

On Fri, Nov 5, 2021, at 15:26, Thierry Carrez wrote:
The (long) document below reflects the current position of the release management team on a popular question: should the OpenStack release cadence be changed? Please note that we only address the release management / stable branch management facet of the problem. There are other dimensions to take into account (governance, feature deprecation, supported distros...) to get a complete view of the debate.
I think it's time to have a conversation with all the parties to move forward, taking more than one dimension into account. It would be sad if we couldn't progress all together.
The main pressure to release more often is to make features available to users faster. Developers get a faster feedback loop, hardware vendors ensure software is compatible with their latest products, and users get exciting new features. "Release early, release often" is a best practice in our industry -- we should generally aim at releasing as often as possible.
My view is that we are in a place where OpenStack projects are very well tested together nowadays. This test coverage reduces the need for "coordinated releases" with larger testing... to the point that some operators are (al)ready to consume the master branch. So, for those who need the latest features early (on a long-term basis), there are two choices, regardless of the release & branching cycle: stay on that rolling-forward branch (master), or manage your own fork of the code. That choice wasn't really possible in the early days of OpenStack without taking larger risks.
But that is counterbalanced by pressure to release less often. From a development perspective, each release cycle comes with some process overhead. On the integrators side, a new release means packaging and validation work. On the users side, it means pressure to upgrade. To justify that cost, there needs to be enough user-visible benefit (like new features) in a given release.
Very good summary.
For the last 10 years for OpenStack, that balance has been around six months. Six months let us accumulate enough new development that it was worth upgrading to / integrating the new version, while giving enough time to actually do the work. It also aligned well with Foundation events cadence, allowing to synchronize in-person developer meetings date with start of cycles.
I think we're hitting something here.
The major recent change affecting this trade-off is that the pace of new development in OpenStack slowed down. The rhythm of changes was divided by 3 between 2015 and 2021, reflecting that OpenStack is now a mature and stable solution, where accessing the latest features is no longer a major driver. That reduces some of the pressure for releasing more often. At the same time, we have more users every day, with larger and larger deployments, and keeping those clusters constantly up to date is an operational challenge. That increases the pressure to release less often. In essence, OpenStack is becoming much more like a LTS distribution than a web browser -- something users like moving slow.
Over the past years, project teams also increasingly decoupled individual components from the "coordinated release". More and more components opted for an independent or intermediary-released model, where they can put out releases in the middle of a cycle, making new features available to their users. This increasingly opens up the possibility of a longer "coordinated release" which would still allow development teams to follow "release early, release often" best practices. All that recent evolution means it is (again) time to reconsider if the 6-month cadence is what serves our community best, and in particular if a longer release cadence would not suit us better.
Again, thanks to the increase in testability (projects tested together), it could be time for us to step away from the whole model of coordinated releases, which is IMO part of this problem. I feel it's okay for a project to release when it is ready/has something to release. What's holding us back from doing that? Again, if we stop pushing this "artificial" release model, we'll stop the branching efforts that cause overhead work. It's not bringing any value to the ecosystem anymore.
While releasing less often would definitely reduce the load on the release management team, most of the team work being automated, we do not think it should be a major factor in motivating the decision. We should not adjust the cadence too often though, as there is a one-time cost in switching our processes. In terms of impact, we expect that a switch to a longer cycle will encourage more project teams to adopt a "with-intermediary" release model (rather than the traditional "with-rc" single release per cycle), which may lead to abandoning the latter, hence simplifying our processes. Longer cycles might also discourage people to commit to PTL or release liaison work. We'd probably need to manage expectations there, and encourage more frequent switches (or create alternate models).
I feel it's okay to reduce the cadence of a 'coordinated release' to a year, from a consumer perspective. However, I think it's not the right path forward _without other changes_ (see my comment above, and the reduction of the number of branches). If the release work hasn't changed since I was on the team, having a longer cycle means more patches to review inside a single release. Of course, less activity in OpenStack has a good counterbalancing effect here. I just believe it's better to release _more often_ than not. But _branching_ should be reduced as much as possible; that's the costly part (from what I have seen, tell me if I am wrong). I don't see any value in making longer releases for the sake of it. I don't see the reason for multiplying the number of branches and upgrade paths to maintain.
If the decision is made to switch to a longer cycle, the release management team recommends to switch to one year directly. That would avoid changing it again anytime soon, and synchronizing on a calendar year is much simpler to follow and communicate. We also recommend announcing the change well in advance. We currently have an opportunity of making the switch when we reach the end of the release naming alphabet, which would also greatly simplify the communications around the change.
Wouldn't it be easier to reduce the branching altogether, branching only when necessary, and let projects branch when they need to? If we define strict rules for branching (and limit the annoying bits for the consumers), it will increase the quality of the ecosystem IMO. It will also be easier to manage from a packager's perspective. Next to that, indeed, a "coordinated release" once a year sounds like a good idea for our users ("I am using OpenStack edition 2021").
Finally, it is worth mentioning the impact on the stable branch work. Releasing less often would likely impact the number of stable branches that we keep on maintaining, so that we do not go too much in the past (and hit unmaintained distributions or long-gone dependencies). We currently maintain releases for 18 months before they switch to extended maintenance, which results in between 3 and 4 releases being maintained at the same time. We'd recommend switching to maintaining one-year releases for 24 months, which would result in between 2 and 3 releases being maintained at the same time. Such a change would lead to longer maintenance for our users while reducing backporting work for our developers.
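As a rough check of the arithmetic above, here is a minimal sketch, assuming releases land exactly every N months and each one is maintained for exactly M months after its release date (a simplification; real release and end-of-maintenance dates drift). The function name `maintained_range` and the monthly granularity are illustrative assumptions, not part of any release tooling.

```python
def maintained_range(cadence_months, maintenance_months, horizon_months=240):
    """Return (min, max) number of releases maintained at the same time.

    Assumption: one release every `cadence_months`, starting at month 0,
    each maintained for `maintenance_months` (both endpoints inclusive).
    """
    release_months = range(0, horizon_months + 1, cadence_months)
    counts = []
    # Skip the ramp-up period so that only the steady state is counted.
    for month in range(maintenance_months, horizon_months + 1):
        active = sum(
            1 for released in release_months
            if 0 <= month - released <= maintenance_months
        )
        counts.append(active)
    return min(counts), max(counts)


if __name__ == "__main__":
    # Current model: 6-month cadence, 18 months of maintenance.
    print(maintained_range(6, 18))
    # Proposed model: 12-month cadence, 24 months of maintenance.
    print(maintained_range(12, 24))
```

Under these assumptions, the higher count only occurs in the month where a new release overlaps with the oldest one leaving maintenance.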
With people churn, the work will be even harder to maintain. I think, however, that it's delaying the problem: we are not _fixing_ the base need. Managing upgrades across 2/3 releases of a complete OpenStack stack of projects would be an increased effort for maintainers, just done less frequently. For maintainers, it makes more sense to phase work organically, based on project needs. If you are thinking of distros, having to manage all the work when a release is out takes far more coordination than if things were released over time. My experience at SUSE was that even the branching model is debatable: it was more work, and after all, we were taking the code we wanted and putting our patches on top if those didn't make it upstream/weren't backported in time for x reasons (valid or not ;)). So basically, for me, the stable branches have very little value nowadays from the community perspective (it would be good enough if everybody fixed master, IMO). I am not sure I am the only one seeing it that way. I still feel it's worth documenting.
From the "refstack" (or whatever it's called now) perspective, an 'OpenStack Powered Platform xx' is still possible with this model. We need to define a yearly baseline of the versions of the software we expect, the APIs that software exposes, and the testing around them. No need for branching, "release often" still works, projects are autonomous/owners of their destiny, and we keep the coordination.
Sorry for the long post for only my $0.02 ;)

Regards,
Jean-Philippe Evrard (evrardjp)