[release] decentralising release approvals
Hi,
If this topic has been discussed many times already then someone please say so and we can leave it alone.
As kolla PTL and ironic release liaison I've proposed a number of release patches recently. Generally the release team is good at churning through these, but sometimes patches can hang around for a while. Usually a ping on IRC will get things moving again within a day or so (thanks in particular to Sean who has been very responsive).
Related to the recent discussion about decentralising stable branch maintenance, should we do this for releases too? To be clear, I'm proposing giving project teams more control over their own releases.
I brought this up in IRC recently and there was a brief discussion. I suggested that releases are hard to undo, and Monty corrected me by saying they are impossible to undo. Still, if teams (or a subset of their members) had more ownership of their releases, they would become more familiar with the process, and the risk would be reduced.
I'm not suggesting that release team should be disbanded - there is clearly work to do to maintain the tooling, determine and communicate the schedule, and more. But there's a lot of toil involved in checking and approving patches to the releases repo, and it's done by some of our most senior, busy colleagues.
There are a number of checks that are performed automatically by the tooling and used to gate merges. I have a few questions for the release team about these reviews.
* What manual checks do you do beyond those that are currently automated?
* Could the above checks be automated?
* What issues have you caught that were not caught by CI jobs?
Hopefully I haven't offended anyone here. There's often more involved with these things than you first suspect.
Cheers,
Mark
Mark Goddard wrote:
[...] As kolla PTL and ironic release liaison I've proposed a number of release patches recently. Generally the release team is good at churning through these, but sometimes patches can hang around for a while. Usually a ping on IRC will get things moving again within a day or so (thanks in particular to Sean who has been very responsive).
I agree we've seen an increase in processing delay lately, and I'd like to correct that. There are generally three things that would cause a perceptible delay in release processing...
1- wait for two release managers +2
This is something we put in place some time ago, as we had a lot of new members and thought that would be a good way to onboard them. Lately it created delays as a lot of those were not as active.
2- stable releases
Two subcases in there... Either the deliverable is under stable policy and there are *significant* delays there as we have to pause to give a chance to stable-maint-core people to voice an opinion. Or the deliverable is not under stable policy, but we do a manual check on the changes, as a way to educate the requester on semver.
3- waiting for PTL/release liaison to approve
That can take a long time, but the release management team is not really at fault there.
Could you describe where you've seen "sometimes patches can hang around for a while"? I suspect they belong in the (2) category?
[...] I have a few questions for the release team about these reviews.
* What manual checks do you do beyond those that are currently automated?
See https://releases.openstack.org/reference/reviewer_guide.html
* Could the above checks be automated?
We aggressively automate everything that can be. Like I'm currently working to automate the check that the release was approved by the PTL or release liaison.
* What issues have you caught that were not caught by CI jobs?
It's generally semver violations, or timing issues (like requesting a release during a freeze). Sometimes it's corner cases not handled (yet) by automation, like incompatibility between the release version asked and the deliverable release model. You can look at the history of releases for examples.
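As an illustration of the kind of semver check mentioned above, here is a minimal sketch in Python. It is not the release tooling's actual implementation: the function names, the restriction to plain X.Y.Z versions, and the `max_bump` policy parameter are all assumptions made for this example.

```python
from typing import Tuple

# Bump kinds ordered from least to most significant.
BUMP_ORDER = ("patch", "minor", "major")


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse a plain 'X.Y.Z' version string into a tuple of integers."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isdecimal() for p in parts):
        raise ValueError(f"not a plain X.Y.Z version: {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return major, minor, patch


def classify_bump(previous: str, proposed: str) -> str:
    """Return which component changed between two versions."""
    old, new = parse_version(previous), parse_version(proposed)
    if new <= old:
        raise ValueError(f"{proposed} is not newer than {previous}")
    if new[0] != old[0]:
        return "major"
    if new[1] != old[1]:
        return "minor"
    return "patch"


def bump_within_limit(previous: str, proposed: str, max_bump: str) -> bool:
    """Check a bump against a caller-chosen limit (hypothetical policy knob)."""
    bump = classify_bump(previous, proposed)
    return BUMP_ORDER.index(bump) <= BUMP_ORDER.index(max_bump)
```

A reviewer-side script could call `bump_within_limit("1.2.3", "1.3.0", "patch")` to flag a request that bumps more than the chosen limit. Whether the changes in the release actually justify the bump still needs a human look at the commit list.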
Hopefully I haven't offended anyone here. There's often more involved with these things than you first suspect.
Decentralizing would be a lot of work to create new systems and processes... and I don't think we can automate everything. It's unreasonable to expect everyone to know the release process by heart and respect timing and freezes. And releases are the only thing we produce that we can't undo.
I would rather eliminate the issue by making sure release processing is back to fast. So here is my proposal:
- go back to single release manager approval
- directly approve stable releases after a cursory semver check, not waiting for stable-maint-core approval.
That should make sure all releases are processed within a couple of days, which I think is a good trade-off between retaining some releases for 10+ days and not having a chance to catch odd cases before releases at all.
Thoughts?
-- Thierry Carrez (ttx)
On 12/20/2019 4:04 AM, Thierry Carrez wrote:
- directly approve stable releases after a cursory semver check, not waiting for stable-maint-core approval.
Rather than go this drastic, how about just not waiting for stable-maint-core (tonyb or myself) to review a stable branch release proposal but if you're going to decentralize stable core to be per-project rather than stable-maint-core, then do like the non-stable release request PTL/liaison thing and ack once one of the per-project stable maint cores signs off on the release or is the person requesting the stable branch release?
IOW, don't go from super high standard stable policy release review check to no check, but push the burden to per-project stable cores since they want the responsibility and this is part of the deal.
--
Thanks,
Matt
On Fri, 20 Dec 2019 at 15:17, Matt Riedemann <mriedemos@gmail.com> wrote:
On 12/20/2019 4:04 AM, Thierry Carrez wrote:
- directly approve stable releases after a cursory semver check, not waiting for stable-maint-core approval.
Rather than go this drastic, how about just not waiting for stable-maint-core (tonyb or myself) to review a stable branch release proposal but if you're going to decentralize stable core to be per-project rather than stable-maint-core, then do like the non-stable release request PTL/liaison thing and ack once one of the per-project stable maint cores signs off on the release or is the person requesting the stable branch release?
IOW, don't go from super high standard stable policy release review check to no check, but push the burden to per-project stable cores since they want the responsibility and this is part of the deal.
Are there many cases where the PTL/release liaison (who must approve a release) isn't also in the per-project stable team?
--
Thanks,
Matt
On 12/20/2019 2:53 PM, Mark Goddard wrote:
Are there many cases where the PTL/release liaison (who must approve a release) isn't also in the per-project stable team?
Many? I don't know about many, but yes there are or have been quite a few projects where the PTL has never done a stable branch review before so they aren't on the stable branch core team.
--
Thanks,
Matt
On Fri, 20 Dec 2019, 10:06 Thierry Carrez, <thierry@openstack.org> wrote:
Mark Goddard wrote:
[...] As kolla PTL and ironic release liaison I've proposed a number of release patches recently. Generally the release team is good at churning through these, but sometimes patches can hang around for a while. Usually a ping on IRC will get things moving again within a day or so (thanks in particular to Sean who has been very responsive).
I agree we've seen an increase in processing delay lately, and I'd like to correct that. There are generally three things that would cause a perceptible delay in release processing...
1- wait for two release managers +2
This is something we put in place some time ago, as we had a lot of new members and thought that would be a good way to onboard them. Lately it created delays as a lot of those were not as active.
2- stable releases
Two subcases in there... Either the deliverable is under stable policy and there are *significant* delays there as we have to pause to give a chance to stable-maint-core people to voice an opinion. Or the deliverable is not under stable policy, but we do a manual check on the changes, as a way to educate the requester on semver.
3- waiting for PTL/release liaison to approve
That can take a long time, but the release management team is not really at fault there.
Could you describe where you've seen "sometimes patches can hang around for a while"? I suspect they belong in the (2) category?
I hadn't realised there was a requirement for the stable team to review stable patches. That could explain some of my experience. It could also be due to kolla being cycle-trailing, we often make releases at unusual times.
[...] I have a few questions for the release team about these reviews.
* What manual checks do you do beyond those that are currently automated?
See https://releases.openstack.org/reference/reviewer_guide.html
* Could the above checks be automated?
We aggressively automate everything that can be. Like I'm currently working to automate the check that the release was approved by the PTL or release liaison.
* What issues have you caught that were not caught by CI jobs?
It's generally semver violations, or timing issues (like requesting a release during a freeze). Sometimes it's corner cases not handled (yet) by automation, like incompatibility between the release version asked and the deliverable release model. You can look at the history of releases for examples.
Hopefully I haven't offended anyone here. There's often more involved with these things than you first suspect.
Decentralizing would be a lot of work to create new systems and processes... and I don't think we can automate everything. It's unreasonable to expect everyone to know the release process by heart and respect timing and freezes. And releases are the only thing we produce that we can't undo.
I would rather eliminate the issue by making sure release processing is back to fast. So here is my proposal:
- go back to single release manager approval
This seems like it should make a big difference - reducing the load on reviewers and the requirements for approval should reduce time in flight.
- directly approve stable releases after a cursory semver check, not waiting for stable-maint-core approval.
That should make sure all releases are processed within a couple of days, which I think is a good trade-off between retaining some releases for 10+ days and not having a chance to catch odd cases before releases at all.
Thoughts?
Thanks for the detailed response. I tend to prefer models where teams can be self-sufficient using shared tooling and policies, but I'm also missing some context and history, and don't have to clean up when things go wrong. Ultimately, you've proposed some simple changes which should improve the situation, so that's a good result in my view.
-- Thierry Carrez (ttx)
On Fri, Dec 20, 2019 at 11:04:36AM +0100, Thierry Carrez wrote:
Mark Goddard wrote:
[...] As kolla PTL and ironic release liaison I've proposed a number of release patches recently. Generally the release team is good at churning through these, but sometimes patches can hang around for a while. Usually a ping on IRC will get things moving again within a day or so (thanks in particular to Sean who has been very responsive).
I agree we've seen an increase in processing delay lately, and I'd like to correct that. There are generally three things that would cause a perceptible delay in release processing...
1- wait for two release managers +2
This is something we put in place some time ago, as we had a lot of new members and thought that would be a good way to onboard them. Lately it created delays as a lot of those were not as active.
2- stable releases
Two subcases in there... Either the deliverable is under stable policy and there are *significant* delays there as we have to pause to give a chance to stable-maint-core people to voice an opinion. Or the deliverable is not under stable policy, but we do a manual check on the changes, as a way to educate the requester on semver.
3- waiting for PTL/release liaison to approve
That can take a long time, but the release management team is not really at fault there.
Coming back to hopefully wrap this up...
We discussed this in today's release team meeting and decided to make some changes to hopefully make things a little smoother. We will now use the following guidelines for reviewing and approving release requests:
For releases in the current development cycle (including some time for the previous cycle for the release-trailing deliverables) we will only require a single reviewer. If everything looks good and there are no concerns, we will +2 and approve the release request without waiting for a second. If the reviewer has any doubts or hesitation, they can decide to wait for a second reviewer, but this should be a much less common situation.
For stable releases, we will require two +2s. We will not, however, wait for a designated day for stable team review. If we can get one, all the better, but the normal release team should be aware of stable rules and look for them for any stable release request. Keeping the requirement for two reviewers should help make sure nothing is overlooked with stable policy.
We do still want PTL/liaison +1 to approve, so we will continue to wait for that. Thierry is working on some job automation to make checking for that a little easier, so hopefully that will help make that process as smooth as possible.
If there are any other questions or concerns, please do let us know.
Sean
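The guidelines above amount to a small decision rule. The following Python sketch restates it under stated assumptions: the `ReleaseRequest` class and its field names are hypothetical and do not correspond to anything in the releases repository or in Gerrit, and the optional "wait for a second reviewer if in doubt" judgment call is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ReleaseRequest:
    """Hypothetical summary of the review state of one release patch."""
    is_stable: bool               # True for a stable branch release
    release_team_plus_twos: int   # number of +2 votes from release managers
    ptl_or_liaison_ack: bool      # PTL/liaison gave +1 (or proposed the patch)


def required_plus_twos(request: ReleaseRequest) -> int:
    """Stable releases need two +2s; development-cycle releases need one."""
    return 2 if request.is_stable else 1


def ready_to_approve(request: ReleaseRequest) -> bool:
    """Apply the guidelines: PTL/liaison ack plus enough release team +2s."""
    enough_votes = request.release_team_plus_twos >= required_plus_twos(request)
    return request.ptl_or_liaison_ack and enough_votes
```

For example, a stable release request with two release team +2s and a PTL or liaison +1 would satisfy `ready_to_approve`, while the same request with only one +2 would not.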
Hi Sean,
just to verify my interpretation. This means e.g. [1] is now good to go? 2 release team members, 1 liaison and 1 PTL extra (for stable release).
[1] https://review.opendev.org/701080
-yoctozepto
On Thu, 9 Jan 2020 at 18:03, Sean McGinnis <sean.mcginnis@gmx.com> wrote:
On Fri, Dec 20, 2019 at 11:04:36AM +0100, Thierry Carrez wrote:
Mark Goddard wrote:
[...] As kolla PTL and ironic release liaison I've proposed a number of release patches recently. Generally the release team is good at churning through these, but sometimes patches can hang around for a while. Usually a ping on IRC will get things moving again within a day or so (thanks in particular to Sean who has been very responsive).
I agree we've seen an increase in processing delay lately, and I'd like to correct that. There are generally three things that would cause a perceptible delay in release processing...
1- wait for two release managers +2
This is something we put in place some time ago, as we had a lot of new members and thought that would be a good way to onboard them. Lately it created delays as a lot of those were not as active.
2- stable releases
Two subcases in there... Either the deliverable is under stable policy and there are *significant* delays there as we have to pause to give a chance to stable-maint-core people to voice an opinion. Or the deliverable is not under stable policy, but we do a manual check on the changes, as a way to educate the requester on semver.
3- waiting for PTL/release liaison to approve
That can take a long time, but the release management team is not really at fault there.
Coming back to hopefully wrap this up...
We discussed this in today's release team meeting and decided to make some changes to hopefully make things a little smoother. We will now use the following guidelines for reviewing and approving release requests:
For releases in the current development cycle (including some time for the previous cycle for the release-trailing deliverables) we will only require a single reviewer. If everything looks good and there are no concerns, we will +2 and approve the release request without waiting for a second. If the reviewer has any doubts or hesitation, they can decide to wait for a second reviewer, but this should be a much less common situation.
For stable releases, we will require two +2s. We will not, however, wait for a designated day for stable team review. If we can get one, all the better, but the normal release team should be aware of stable rules and look for them for any stable release request. Keeping the requirement for two reviewers should help make sure nothing is overlooked with stable policy.
We do still want PTL/liaison +1 to approve, so we will continue to wait for that. Thierry is working on some job automation to make checking for that a little easier, so hopefully that will help make that process as smooth as possible.
If there are any other questions or concerns, please do let us know.
Sean
On Thu, Jan 09, 2020 at 06:08:53PM +0100, Radosław Piliszek wrote:
Hi Sean,
just to verify my interpretation. This means e.g. [1] is now good to go? 2 release team members, 1 liaison and 1 PTL extra (for stable release).
[1] https://review.opendev.org/701080
-yoctozepto
Correct. I will take one quick look again, and if all looks good get that one going.
Sean
Thanks, Sean.
-yoctozepto
On Thu, 9 Jan 2020 at 18:17, Sean McGinnis <sean.mcginnis@gmx.com> wrote:
On Thu, Jan 09, 2020 at 06:08:53PM +0100, Radosław Piliszek wrote:
Hi Sean,
just to verify my interpretation. This means e.g. [1] is now good to go? 2 release team members, 1 liaison and 1 PTL extra (for stable release).
[1] https://review.opendev.org/701080
-yoctozepto
Correct. I will take one quick look again, and if all looks good get that one going.
Sean
On Thu, 9 Jan 2020 at 17:01, Sean McGinnis <sean.mcginnis@gmx.com> wrote:
On Fri, Dec 20, 2019 at 11:04:36AM +0100, Thierry Carrez wrote:
Mark Goddard wrote:
[...] As kolla PTL and ironic release liaison I've proposed a number of release patches recently. Generally the release team is good at churning through these, but sometimes patches can hang around for a while. Usually a ping on IRC will get things moving again within a day or so (thanks in particular to Sean who has been very responsive).
I agree we've seen an increase in processing delay lately, and I'd like to correct that. There are generally three things that would cause a perceptible delay in release processing...
1- wait for two release managers +2
This is something we put in place some time ago, as we had a lot of new members and thought that would be a good way to onboard them. Lately it created delays as a lot of those were not as active.
2- stable releases
Two subcases in there... Either the deliverable is under stable policy and there are *significant* delays there as we have to pause to give a chance to stable-maint-core people to voice an opinion. Or the deliverable is not under stable policy, but we do a manual check on the changes, as a way to educate the requester on semver.
3- waiting for PTL/release liaison to approve
That can take a long time, but the release management team is not really at fault there.
Coming back to hopefully wrap this up...
We discussed this in today's release team meeting and decided to make some changes to hopefully make things a little smoother. We will now use the following guidelines for reviewing and approving release requests:
For releases in the current development cycle (including some time for the previous cycle for the release-trailing deliverables) we will only require a single reviewer. If everything looks good and there are no concerns, we will +2 and approve the release request without waiting for a second. If the reviewer has any doubts or hesitation, they can decide to wait for a second reviewer, but this should be a much less common situation.
For stable releases, we will require two +2s. We will not, however, wait for a designated day for stable team review. If we can get one, all the better, but the normal release team should be aware of stable rules and look for them for any stable release request. Keeping the requirement for two reviewers should help make sure nothing is overlooked with stable policy.
We do still want PTL/liaison +1 to approve, so we will continue to wait for that. Thierry is working on some job automation to make checking for that a little easier, so hopefully that will help make that process as smooth as possible.
Thanks for taking action on this - I expect the above changes will be a big improvement.
If there are any other questions or concerns, please do let us know.
Sean
participants (5)
- Mark Goddard
- Matt Riedemann
- Radosław Piliszek
- Sean McGinnis
- Thierry Carrez