[nova][ptg] Ussuri scope containment
Nova developers and maintainers-
Every cycle we approve some number of blueprints and then complete a low percentage [1] of them. Which blueprints go unfinished seems to be completely random (notably, it appears to have nothing to do with our declared cycle priorities). This is especially frustrating for consumers of a feature, who (understandably) interpret blueprint/spec approval as a signal that they can reasonably expect the feature to land [2].
The cause for non-completion usually seems to fall into one of several broad categories:
== Inadequate *developer* attention ==
- There's not much to be done about the subset of these where the contributor actually walks away.
- The real problem is where the developer thinks they're ready for reviewers to look, but reviewers don't. Even things that seem obvious to experienced reviewers, like failing CI or "WIP" in the commit title, will cause patches to be completely ignored -- but unseasoned contributors don't necessarily understand even that, let alone more subtle issues. Consequently, patches will languish, with each side expecting the other to take the next action. This is a problem of culture: contributors don't understand nova reviewer procedures and psychology.
== Inadequate *reviewer* attention ==
- Upstream maintainer time is limited.
- We always seem to have low review activity until the last two or three weeks before feature freeze, when there's a frantic uptick and lots gets done.
- But there's a cultural rift here as well. Getting maintainers to care about a blueprint is hard if they don't already have a stake in it. The "squeaky wheel" concept is not well understood by unseasoned contributors. The best way to get reviews is to lurk in IRC and beg. Aside from not being intuitive, this can also be difficult logistically (time zone pain, knowing which nicks to ping and how) as well as interpersonally (how much begging is enough? too much? when is it appropriate?).
== Multi-release efforts that we knew were going to be multi-release ==
These may often drag on far longer than they perhaps should, but I'm not going to try to address that here.
========
There's nothing new or surprising about the above. We've tried to address these issues in various ways in the past, with varying degrees of effectiveness.
I'd like to try a couple more.
(A) Constrain scope, drastically. We marked 25 blueprints complete in Train [3]. Since there has been no change to the core team, let's limit Ussuri to 25 blueprints [4]. If this turns out to be too few, what's the worst thing that happens? We finish everything, early, and wish we had done more. If that happens, drinks are on me, and we can bump the number for V.
(B) Require a core to commit to "caring about" a spec before we approve it. The point of this "core liaison" is to act as a mentor to mitigate the cultural issues noted above [5], and to be a first point of contact for reviews. I've proposed this to the spec template here [6].
Thoughts?
efried
[1] Like in the neighborhood of 60%. This is anecdotal; I'm not aware of a good way to go back and mine actual data.
[2] Stuff happens, sure, and nobody expects 100%, but 60%? Come on, we have to be able to do better than that.
[3] https://blueprints.launchpad.net/nova/train
[4] Recognizing of course that not all blueprints are created equal, this is more an attempt at a reasonable heuristic than an actual expectation of total size/LOC/person-hours/etc. The theory being that constraining to an actual number, whatever the number may be, is better than not constraining at all.
[5] If you're a core, you can be your own liaison, because presumably you don't need further cultural indoctrination or help begging for reviews.
[6] https://review.opendev.org/685857
On Tue, Oct 1, 2019 at 1:09 AM, Eric Fried openstack@fried.cc wrote:
Nova developers and maintainers-
Every cycle we approve some number of blueprints and then complete a low percentage [1] of them. Which blueprints go unfinished seems to be completely random (notably, it appears to have nothing to do with our declared cycle priorities). This is especially frustrating for consumers of a feature, who (understandably) interpret blueprint/spec approval as a signal that they can reasonably expect the feature to land [2].
The cause for non-completion usually seems to fall into one of several broad categories:
== Inadequate *developer* attention ==
- There's not much to be done about the subset of these where the contributor actually walks away.
- The real problem is where the developer thinks they're ready for reviewers to look, but reviewers don't. Even things that seem obvious to experienced reviewers, like failing CI or "WIP" in the commit title, will cause patches to be completely ignored -- but unseasoned contributors don't necessarily understand even that, let alone more subtle issues. Consequently, patches will languish, with each side expecting the other to take the next action. This is a problem of culture: contributors don't understand nova reviewer procedures and psychology.
== Inadequate *reviewer* attention ==
- Upstream maintainer time is limited.
- We always seem to have low review activity until the last two or three weeks before feature freeze, when there's a frantic uptick and lots gets done.
- But there's a cultural rift here as well. Getting maintainers to care about a blueprint is hard if they don't already have a stake in it. The "squeaky wheel" concept is not well understood by unseasoned contributors. The best way to get reviews is to lurk in IRC and beg. Aside from not being intuitive, this can also be difficult logistically (time zone pain, knowing which nicks to ping and how) as well as interpersonally (how much begging is enough? too much? when is it appropriate?).
When I joined, I was taught that instead of begging you should go and review open patches, which a) helps with the dev team's review load and b) makes you known in the community. Both help you get reviews on your own patches. Does it always work? No. Do I like begging for reviews? No. Do I like getting repeatedly pinged to review? No. So I would suggest not declaring that the only way to get reviews is to go and beg.
== Multi-release efforts that we knew were going to be multi-release ==
These may often drag on far longer than they perhaps should, but I'm not going to try to address that here.
========
There's nothing new or surprising about the above. We've tried to address these issues in various ways in the past, with varying degrees of effectiveness.
I'd like to try a couple more.
(A) Constrain scope, drastically. We marked 25 blueprints complete in Train [3]. Since there has been no change to the core team, let's limit Ussuri to 25 blueprints [4]. If this turns out to be too few, what's the worst thing that happens? We finish everything, early, and wish we had done more. If that happens, drinks are on me, and we can bump the number for V.
I support the idea that we limit our scope. But it is pretty hard to select which 25 (or whatever amount we agree on) blueprints we approve out of a possible ~50. What will be the method of selection?
(B) Require a core to commit to "caring about" a spec before we approve it. The point of this "core liaison" is to act as a mentor to mitigate the cultural issues noted above [5], and to be a first point of contact for reviews. I've proposed this to the spec template here [6].
I proposed this before and I still think this could help. And, to partially answer my question above, this could be one of the ways to limit the approved blueprints: if each core only commits to "care about" the implementation of 2 blueprints, then we already have a limit on the number of approved blueprints.
Cheers, gibi
Thoughts?
efried
On 01/10/19 07:30 +0000, Balázs Gibizer wrote:
On Tue, Oct 1, 2019 at 1:09 AM, Eric Fried openstack@fried.cc wrote:
Nova developers and maintainers-
Every cycle we approve some number of blueprints and then complete a low percentage [1] of them. Which blueprints go unfinished seems to be completely random (notably, it appears to have nothing to do with our declared cycle priorities). This is especially frustrating for consumers of a feature, who (understandably) interpret blueprint/spec approval as a signal that they can reasonably expect the feature to land [2].
The cause for non-completion usually seems to fall into one of several broad categories:
== Inadequate *developer* attention ==
- There's not much to be done about the subset of these where the contributor actually walks away.
- The real problem is where the developer thinks they're ready for reviewers to look, but reviewers don't. Even things that seem obvious to experienced reviewers, like failing CI or "WIP" in the commit title, will cause patches to be completely ignored -- but unseasoned contributors don't necessarily understand even that, let alone more subtle issues. Consequently, patches will languish, with each side expecting the other to take the next action. This is a problem of culture: contributors don't understand nova reviewer procedures and psychology.
== Inadequate *reviewer* attention ==
- Upstream maintainer time is limited.
- We always seem to have low review activity until the last two or three weeks before feature freeze, when there's a frantic uptick and lots gets done.
- But there's a cultural rift here as well. Getting maintainers to care about a blueprint is hard if they don't already have a stake in it. The "squeaky wheel" concept is not well understood by unseasoned contributors. The best way to get reviews is to lurk in IRC and beg. Aside from not being intuitive, this can also be difficult logistically (time zone pain, knowing which nicks to ping and how) as well as interpersonally (how much begging is enough? too much? when is it appropriate?).
When I joined, I was taught that instead of begging you should go and review open patches, which a) helps with the dev team's review load and b) makes you known in the community. Both help you get reviews on your own patches. Does it always work? No. Do I like begging for reviews? No. Do I like getting repeatedly pinged to review? No. So I would suggest not declaring that the only way to get reviews is to go and beg.
+1
In projects I have worked on there is no need to encourage extra begging and squeaky wheel prioritization has IMO not been a healthy thing.
There is no better way to get one's reviews stalled than to beg for reviews with patches that are not close to ready for review while at the same time contributing no useful reviews oneself.
There is nothing wrong with pinging to get attention to a review if it is ready and languishing, or if it solves an urgent issue, but even in these cases a ping from someone who doesn't "cry wolf" and who has built a reputation as a contributor carries more weight.
== Multi-release efforts that we knew were going to be multi-release ==
These may often drag on far longer than they perhaps should, but I'm not going to try to address that here.
========
There's nothing new or surprising about the above. We've tried to address these issues in various ways in the past, with varying degrees of effectiveness.
I'd like to try a couple more.
(A) Constrain scope, drastically. We marked 25 blueprints complete in Train [3]. Since there has been no change to the core team, let's limit Ussuri to 25 blueprints [4]. If this turns out to be too few, what's the worst thing that happens? We finish everything, early, and wish we had done more. If that happens, drinks are on me, and we can bump the number for V.
I support the idea that we limit our scope. But it is pretty hard to select which 25 (or whatever amount we agree on) blueprints we approve out of a possible ~50. What will be the method of selection?
(B) Require a core to commit to "caring about" a spec before we approve it. The point of this "core liaison" is to act as a mentor to mitigate the cultural issues noted above [5], and to be a first point of contact for reviews. I've proposed this to the spec template here [6].
I proposed this before and I still think this could help. And, to partially answer my question above, this could be one of the ways to limit the approved blueprints: if each core only commits to "care about" the implementation of 2 blueprints, then we already have a limit on the number of approved blueprints.
Cheers, gibi
Thoughts?
efried
On 2019-10-01 08:38:50 -0400 (-0400), Tom Barron wrote: [...]
In projects I have worked on there is no need to encourage extra begging and squeaky wheel prioritization has IMO not been a healthy thing.
There is no better way to get one's reviews stalled than to beg for reviews with patches that are not close to ready for review while at the same time contributing no useful reviews oneself.
There is nothing wrong with pinging to get attention to a review if it is ready and languishing, or if it solves an urgent issue, but even in these cases a ping from someone who doesn't "cry wolf" and who has built a reputation as a contributor carries more weight.
[...]
Agreed, it drives back to Eric's comment about familiarity with the team's reviewer culture. Just saying "hey I pushed these patches can someone look" is often far less effective for a newcomer than "I reported a bug in subsystem X which is really aggravating and review 654321 fixes it if anyone has a moment to look" or "tbarron: I addressed your comments on review 654321 when you get a chance to revisit your -1, thanks!"
My cardinal rules of begging: Don't mention the nicks of random people who have not been involved in the change unless you happen to actually know it's one they'll personally be interested in. Provide as much context as possible (within reason) to attract the actual interest of potential reviewers. Be polite, thank people, and don't assume your change is important to anyone nor that there's someone who has time to look at it. And most important, as you noted too, if you're waiting around then take a few minutes and go review something to pass the time! ;)
Thanks for the responses, all.
This subthread is becoming tangential to my original purpose, so I'm renaming it.
The best way to get reviews is to lurk in IRC and beg.
<snip>
When I joined, I was taught that instead of begging you should go and review open patches, which a) helps with the dev team's review load and b) makes you known in the community. Both help you get reviews on your own patches. Does it always work? No. Do I like begging for reviews? No. Do I like getting repeatedly pinged to review? No. So I would suggest not declaring that the only way to get reviews is to go and beg.
I recognize I was generalizing; begging isn't really "the best way" to get reviews. Doing reviews and becoming known (and *then* begging :) is far more effective -- but is literally impossible for many contributors. Even if they have the time (percentage of work week) to dedicate upstream, it takes massive effort and time (calendar) to get there. We can not and should not expect this of every contributor.
More...
On 10/1/19 8:00 AM, Jeremy Stanley wrote:
On 2019-10-01 08:38:50 -0400 (-0400), Tom Barron wrote: [...]
In projects I have worked on there is no need to encourage extra begging and squeaky wheel prioritization has IMO not been a healthy thing.
There is no better way to get one's reviews stalled than to beg for reviews with patches that are not close to ready for review while at the same time contributing no useful reviews oneself.
There is nothing wrong with pinging to get attention to a review if it is ready and languishing, or if it solves an urgent issue, but even in these cases a ping from someone who doesn't "cry wolf" and who has built a reputation as a contributor carries more weight.
[...]
Agreed, it drives back to Eric's comment about familiarity with the team's reviewer culture. Just saying "hey I pushed these patches can someone look" is often far less effective for a newcomer than "I reported a bug in subsystem X which is really aggravating and review 654321 fixes it if anyone has a moment to look" or "tbarron: I addressed your comments on review 654321 when you get a chance to revisit your -1, thanks!"
My cardinal rules of begging: Don't mention the nicks of random people who have not been involved in the change unless you happen to actually know it's one they'll personally be interested in. Provide as much context as possible (within reason) to attract the actual interest of potential reviewers. Be polite, thank people, and don't assume your change is important to anyone nor that there's someone who has time to look at it. And most important, as you noted too, if you're waiting around then take a few minutes and go review something to pass the time! ;)
This is *precisely* the kind of culture that we cannot expect inexperienced contributors to understand. We can write it down [1], but then we have to get people to read what's written.
To tie back to the original thread, this is where it would help to have a core (or experienced dev) as a mentor/liaison to be the first point of contact for questions, guidance, etc. Putting it in the spec process ensures it doesn't get missed (like a doc sitting "out there" somewhere).
efried
[1] though I fear that would end up being a long-winded and wandering tome, difficult to read and grok, assuming we could even agree on what it should say (frankly, there are some aspects we should be embarrassed to admit in writing)
On Tue, Oct 1, 2019 at 5:00 PM, Eric Fried openstack@fried.cc wrote:
Thanks for the responses, all.
This subthread is becoming tangential to my original purpose, so I'm renaming it.
The best way to get reviews is to lurk in IRC and beg.
<snip> When I joined, I was taught that instead of begging you should go and review open patches, which a) helps with the dev team's review load and b) makes you known in the community. Both help you get reviews on your own patches. Does it always work? No. Do I like begging for reviews? No. Do I like getting repeatedly pinged to review? No. So I would suggest not declaring that the only way to get reviews is to go and beg.
I recognize I was generalizing; begging isn't really "the best way" to get reviews. Doing reviews and becoming known (and *then* begging :) is far more effective -- but is literally impossible for many contributors. Even if they have the time (percentage of work week) to dedicate upstream, it takes massive effort and time (calendar) to get there. We can not and should not expect this of every contributor.
Sure, it is not easy for a newcomer to read a random nova patch. But I think we should encourage them to do so, as that is one of the ways a newcomer learns how nova (as software) works. I don't expect a newcomer to point out in a nova review that I made a mistake with an obscure nova-specific construct. But I think a newcomer can still give us valuable feedback about code readability, generic Python usage, English grammar...
gibi
---- On Tue, 01 Oct 2019 11:19:44 -0500 Balázs Gibizer balazs.gibizer@est.tech wrote ----
On Tue, Oct 1, 2019 at 5:00 PM, Eric Fried openstack@fried.cc wrote:
Thanks for the responses, all.
This subthread is becoming tangential to my original purpose, so I'm renaming it. (A) Constrain scope, drastically. We marked 25 blueprints complete in Train [3]. Since there has been no change to the core team, let's limit Ussuri to 25 blueprints [4]. If this turns out to be too few, what's the worst thing that happens? We finish everything, early, and wish we had do ne more. If that happens, drinks are on me, and we can bump the number for V.
I like the idea here; it is a more practical than theoretical way to handle this situation, especially in Nova's case. If operators complain about fewer accepted blueprints, we can ask them to invest developer time upstream, which could raise such a cap.
But my question is the same as gibi's: what will be the selection criteria (when we have a large number of ready specs)?
(B) Require a core to commit to "caring about" a spec before we approve it. The point of this "core liaison" is to act as a mentor to mitigate the cultural issues noted above [5], and to be a first point of contact for reviews. I've proposed this to the spec template here [6].
+100 for this. I am sure this way we can burn down more of the approved blueprints.
-gmann
On 10/1/2019 7:38 AM, Tom Barron wrote:
There is no better way to get one's reviews stalled than to beg for reviews with patches that are not close to ready for review while at the same time contributing no useful reviews oneself.
There is nothing wrong with pinging to get attention to a review if it is ready and languishing, or if it solves an urgent issue, but even in these cases a ping from someone who doesn't "cry wolf" and who has built a reputation as a contributor carries more weight.
This is, in large part, why we started doing the runways stuff a few cycles ago so that people wouldn't have to beg when they had blueprint work that was ready to be reviewed, meaning there was mergeable code, i.e. not large chunks of it still in WIP status or untested. It also created a timed queue of blueprints to focus on in a two week window. However, it's not part of everyone's daily review process nor does something being in a runway queue make more than one core care about it, so it's not perfect.
Related to the sponsors idea elsewhere in this thread, I do believe that since we've expanded the entire core team to be able to approve specs, people that are +2 on a spec should be expected to be willing to help in reviewing the resulting blueprint code that comes out of it, but that doesn't always happen. I'm sure I'm guilty of that as well, but in my defense I will say I know I've approved more than one spec I don't personally care about because I felt pressured to approve it just to stop getting asked to review it, i.e. the squeaky wheel thing.
Thanks all for the feedback, refinements, suggestions. Please keep them coming!
If each core only commits to "care about" the implementation of 2 blueprints, then we already have a limit on the number of approved blueprints.
I'd like to not try to prescribe this level of detail. [all blueprints are not created equal] x [all cores are not created equal] = [too many variables]. Different cores will have different amounts of time, effort, and willingness to be liaisons.
I support the idea that we limit our scope. But it is pretty hard to select which 25 (or whatever amount we agree on) blueprints we approve out of a possible ~50. What will be the method of selection?
Basically have a meeting and decide what should fall above or below the line, like you would in a corporate setting. It's not vastly different than how we already decide whether to approve a blueprint; it's just based on resource rather than technical criteria.
(It's a hard thing to have to tell somebody their feature is denied despite having technical merit, but my main driver here is that they would rather know that up front than it be basically a coin toss whose result they don't know until feature freeze.)
So if out of 50 blueprints, say 5 are incomplete due to lack of reviewers' attention, 5 due to lack of developer attention, and 15 fail due to reviewers also being developers and having to make a hard choice... Targeting 30-35 might be better (expecting 5-10 of them to fail anyway, and not due to constrained resources).
Yup, your math makes sense. It pains me to phrase it this way, but it's more realistic:
(A) Let's aim to complete 25 blueprints in Ussuri; so we'll approve 30, expecting 5 to fail.
And the goal of this manifesto is to ensure that ~zero of the 5 incompletes are due to (A) overcommitment and (B) cultural disconnects.
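(To spell the arithmetic out, here is a minimal illustrative sketch; the 25-completed / 5-expected-to-slip figures are just the numbers proposed in this thread, not measurements.)

    # Sketch only: the "approve N, expect some to slip" budgeting math.
    def approval_cap(target_completed, expected_slippage):
        # Approve enough blueprints that, after the slippage we expect
        # for reasons unrelated to reviewer bandwidth, we still land
        # the number we are aiming for.
        return target_completed + expected_slippage

    cap = approval_cap(target_completed=25, expected_slippage=5)
    print(cap)                             # 30 approved
    print('%.0f%%' % (100.0 * 25 / cap))   # 83% completion rate implied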
The other comment I have is that I suspect all blueprints do not have the same weight, so assigning them complexity points could help avoid under/overshooting.
Yeah, that's a legit suggestion, but I really didn't want to go there [1]. I want to try to keep this conceptually as simple as possible, at least the first time around. (I also really don't see the team trying to subvert the process by e.g. making sure we pick the 30 biggest blueprints.)
efried
[1] I have long-lasting scars from my experiences with "story points" and other "agile" planning techniques.
Eric Fried wrote:
[...] There's nothing new or surprising about the above. We've tried to address these issues in various ways in the past, with varying degrees of effectiveness.
I'd like to try a couple more.
(A) Constrain scope, drastically. We marked 25 blueprints complete in Train [3]. Since there has been no change to the core team, let's limit Ussuri to 25 blueprints [4]. If this turns out to be too few, what's the worst thing that happens? We finish everything, early, and wish we had done more. If that happens, drinks are on me, and we can bump the number for V.
(B) Require a core to commit to "caring about" a spec before we approve it. The point of this "core liaison" is to act as a mentor to mitigate the cultural issues noted above [5], and to be a first point of contact for reviews. I've proposed this to the spec template here [6].
Thoughts?
Setting expectations more reasonably is key to growing a healthy long-term environment, so I completely support your efforts here. However, I suspect there will always be blueprints that fail to be completed. If it were purely a question of reviewer resources, then I agree that capping the number of blueprints to the review team's throughput is the right approach. But as you noted, incomplete blueprints happen for a few different reasons, sometimes not related to reviewer effort at all.
So if out of 50 blueprints, say 5 are incomplete due to lack of reviewers' attention, 5 due to lack of developer attention, and 15 fail due to reviewers also being developers and having to make a hard choice... Targeting 30-35 might be better (expecting 5-10 of them to fail anyway, and not due to constrained resources).
The other comment I have is that I suspect all blueprints do not have the same weight, so assigning them complexity points could help avoid under/overshooting.
On 9/30/2019 6:09 PM, Eric Fried wrote:
Every cycle we approve some number of blueprints and then complete a low percentage [1] of them.
[1] Like in the neighborhood of 60%. This is anecdotal; I'm not aware of a good way to go back and mine actual data.
When Mel and I were PTLs we tracked and reported post-release numbers on blueprint activity, what was proposed, what was approved and what was completed:
Ocata:
http://lists.openstack.org/pipermail/openstack-dev/2017-February/111639.html
Pike:
http://lists.openstack.org/pipermail/openstack-dev/2017-September/121875.htm...
Queens:
http://lists.openstack.org/pipermail/openstack-dev/2018-February/127402.html
Rocky:
http://lists.openstack.org/pipermail/openstack-dev/2018-August/133342.html
Stein:
http://lists.openstack.org/pipermail/openstack-discuss/2019-March/004234.htm...
So there are numbers in there for calculating completion percentage over the last 5 releases before Train. Of course the size of the core team and diversity of contributors over that time has changed drastically so it's not comparing apples to apples. But you said you weren't aware of data to mine so I'm giving you an axe and shovel.
So there are numbers in there for calculating completion percentage over the last 5 releases before Train. Of course the size of the core team and diversity of contributors over that time has changed drastically so it's not comparing apples to apples. But you said you weren't aware of data to mine so I'm giving you an axe and shovel.
Perhaps drastic over the last five, but not over the last three, IMHO. Some change, but not enough to account for going from 59 completed in Rocky to 25 in Train. Not all blueprints are the same size, nor require the same amount of effort on the part of any of the parties involved. Involvement ebbs and flows with other commitments, like downstream release timelines. Comparing numbers across many releases makes some sense to me, but I would definitely not think that saying "we completed 25 in T, so we will only approve 25 in U" is reasonable.
(B) Require a core to commit to "caring about" a spec before we approve it. The point of this "core liaison" is to act as a mentor to mitigate the cultural issues noted above [5], and to be a first point of contact for reviews. I've proposed this to the spec template here [6].
As I'm sure you know, we've tried the "core sponsor" thing before. I don't really think it's a bad idea, but it does have a history of not solving the problem like you might think. Constraining cores to not committing to a ton of things may help (although you'll end up with fewer things actually approved if you do that).
--Dan
When Mel and I were PTLs we tracked and reported post-release numbers on blueprint activity, what was proposed, what was approved and what was completed
Thanks Matt. I realized too late in Train that these weren't numbers I would be able to go back and collect after the fact (at least not without a great deal of manual effort) because a blueprint "disappears" from the release once we defer it.
Best approximation: The specs directory for Train contains 37 approved specs. I count five completed specless blueprints in Train. So best case (assuming there were no deferred specless blueprints) that's 25 completed out of 37 + 5 = 42 targeted, i.e. 25/42 ≈ 60%.
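(Aside: a rough sketch of how one could approximate those counts from a nova-specs checkout follows. It assumes the specs/<release>/approved and specs/<release>/implemented directory layout and that completed specs get moved under implemented/ after the release; specless blueprints still have to be tallied by hand.)

    # Sketch only: approximate a cycle's spec completion rate from a
    # local nova-specs checkout, given the assumptions stated above.
    from pathlib import Path
    import sys

    def spec_counts(repo, release):
        base = Path(repo) / 'specs' / release
        approved = len(list((base / 'approved').glob('*.rst')))
        implemented = len(list((base / 'implemented').glob('*.rst')))
        return approved, implemented

    if __name__ == '__main__':
        repo, release = sys.argv[1], sys.argv[2]  # e.g. ./nova-specs train
        approved, implemented = spec_counts(repo, release)
        total = approved + implemented
        pct = 100.0 * implemented / total if total else 0.0
        print('%s: %d of %d specs implemented (%.0f%%)'
              % (release, implemented, total, pct))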
Combining with Matt & Mel's data:
Newton: 64%
Ocata: 67%
Pike: 72%
Queens: 79%
Rocky: 82%
Stein: 59%
Train: 60%
The obvious trend is that new PTLs produce low completion percentages, and Matt would have hit 100% by V if only he hadn't quit :P
But seriously...
Perhaps drastic over the last five, but not over the last three, IMHO. Some change, but not enough to account for going from 59 completed in Rocky to 25 in Train.
Extraction of placement and departure of Jay are drastic, IMHO. But this is just the kind of thing I really wanted to avoid attempting to quantify -- see below.
I would definitely not think that saying "we completed 25 in T, so we will only approve 25 in U" is reasonable.
I agree it's an extremely primitive heuristic. It was a stab at having a cap (as opposed to *not* having a cap) without attempting to account for all the factors, an impossible ask.
I'd love to discuss suggestions for other numbers, or other concrete mechanisms for saying "no" for reasons of resource rather than technical merit. My bid (as of [1]) is 30 approved, shooting for 25 completed (83%, approx the peak of the above numbers). Go.
efried
[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/009860.h...
Extraction of placement and departure of Jay are drastic, IMHO. But this is just the kind of thing I really wanted to avoid attempting to quantify -- see below.
I'm pretty sure Jay wasn't doing 60% of the reviews in Nova, justifying an equivalent drop in our available throughput. Further, I thought splitting out placement was supposed to *reduce* the load on the nova core team? If anything that was a time sink that is now finished, placement is off soaring on its own merits and we have a bunch of resource back as a result, no?
I'd love to discuss suggestions for other numbers, or other concrete mechanisms for saying "no" for reasons of resource rather than technical merit. My bid (as of [1]) is 30 approved, shooting for 25 completed (83%, approx the peak of the above numbers). Go.
How about requiring approved specs to have a majority (or some larger-than-two number) of the cores +2 them, to indicate "yes we should do this, and yes we should do it this cycle"? Some might argue that this unfairly weights efforts that have a lot of cores interested in seeing them land, instead of the actual requisite two, but it sounds like that's what you're shooting for?
--Dan
I'm pretty sure Jay wasn't doing 60% of the reviews in Nova
Clearly not what I was implying.
splitting out placement was supposed to *reduce* the load on the nova core team?
In a sense, that's exactly what I'm suggesting - but it took a couple releases (those releases) to get there. Both the effort to do the extraction and the overlap between the placement and nova teams during that time frame pulled resource away from nova itself.
If anything that was a time sink that is now finished, placement is off soaring on its own merits and we have a bunch of resource back as a result, no?
Okay, I can buy that. Care to put a number on it?
How about requiring approved specs to have a majority (or some larger-than-two number) of the cores +2 them, to indicate "yes we should do this, and yes we should do it this cycle"? Some might argue that this unfairly weights efforts that have a lot of cores interested in seeing them land, instead of the actual requisite two, but it sounds like that's what you're shooting for?
I think the "core sponsor" thing will have this effect: if you can't get a core to sponsor your blueprint, it's a signal that "we" don't think it should be done (this cycle).
I like the >2-core idea, though the real difference would be asking for cores to consider "should we do this *in this cycle*" when they +2 a spec. Which is good and valid, but (I think) difficult to explain/track/quantify/validate. And it's asking each core to have some sense of the "big picture" (understand the scope of all/most of the candidates) which is very difficult.
since we've expanded the entire core team to be able to approve specs, people that are +2 on a spec should be expected to be willing to help in reviewing the resulting blueprint code that comes out of it, but that doesn't always happen.
Agree. I considered trying to enforce that spec and/or blueprint approvers are implicitly signing up to "care about" those specs/blueprints, but I assumed that would result in a drastic reduction in willingness to be an approver :P
Which I suppose would serve to reduce the number of approved blueprints in the cycle... Hm....
efried
On Wed, Oct 2, 2019 at 11:24 PM Eric Fried openstack@fried.cc wrote:
I'm pretty sure Jay wasn't doing 60% of the reviews in Nova
Clearly not what I was implying.
splitting out placement was supposed to *reduce* the load on the nova core team?
In a sense, that's exactly what I'm suggesting - but it took a couple releases (those releases) to get there. Both the effort to do the extraction and the overlap between the placement and nova teams during that time frame pulled resource away from nova itself.
If anything that was a time sink that is now finished, placement is off soaring on its own merits and we have a bunch of resource back as a result, no?
Okay, I can buy that. Care to put a number on it?
How about requiring approved specs to have a majority (or some larger-than-two number) of the cores +2 them, to indicate "yes we should do this, and yes we should do it this cycle"? Some might argue that this unfairly weights efforts that have a lot of cores interested in seeing them land, instead of the actual requisite two, but it sounds like that's what you're shooting for?
I think the "core sponsor" thing will have this effect: if you can't get a core to sponsor your blueprint, it's a signal that "we" don't think it should be done (this cycle).
I like the >2-core idea, though the real difference would be asking for cores to consider "should we do this *in this cycle*" when they +2 a spec. Which is good and valid, but (I think) difficult to explain/track/quantify/validate. And it's asking each core to have some sense of the "big picture" (understand the scope of all/most of the candidates) which is very difficult.
since we've expanded the entire core team to be able to approve specs, people that are +2 on a spec should be expected to be willing to help in reviewing the resulting blueprint code that comes out of it, but that doesn't always happen.
Agree. I considered trying to enforce that spec and/or blueprint approvers are implicitly signing up to "care about" those specs/blueprints, but I assumed that would result in a drastic reduction in willingness to be an approver :P
Actually, that sounds like a very reasonable suggestion from Matt. If you care about reviewing a spec, that also means you care about reviewing the implementation side. Of course, things can happen in the meantime and you can be dragged onto "other stuff" (call it what you want), so you won't have time to commit to the implementation review right away, but your interest is still fully there. In other words, it's a reasonable assumption that cores approving a spec take on some responsibility to move the implementation forward and can consequently be gently pinged for reviews.
Which I suppose would serve to reduce the number of approved blueprints in the cycle... Hm....
That's just a reflection of reality, IMHO.
efried
(B) After some very productive discussion in the nova meeting and IRC channel this morning, I have updated the nova-specs patch introducing the "Core Liaison" concept [1]. The main change is a drastic edit of the README to include a "Core Liaison FAQ". Other changes of note:
* We're now going to make distinct use of the launchpad blueprint's "Definition" and "Direction" fields. As such, we can still decide to defer a blueprint whose spec is merged in the 'approved' directory. (Which really isn't different than what we were doing before; it's just that now we can do it for reasons other than "oops, this didn't get finished in time".)
* The single-core-approval rule for previously approved specifications is removed.
(A) Note that the idea of capping the number of specs is (mostly) unrelated, and we still haven't closed on it. I feel like we've agreed to have a targeted discussion around spec freeze time where we decide whether to defer features for resource reasons. That would be a new (and good, IMO) thing. But it's still TBD whether "30 approved for 25 completed" will apply, and/or what criteria would be used to decide what gets cut.
Collected odds and ends from elsewhere in this thread:
If you care about reviewing a spec, that also means you care about reviewing the implementation side.
I agree that would be nice, and I'd like to make it happen, but separately from what's already being discussed. I added a TODO in the spec README [2].
If we end up with bags of "spare time", there's loads of tech-debt items, performance (it's a feature, let's recall) issues, and meaningful clean-ups waiting to be tackled.
Hear hear.
Viewing this from outside, 25 specs in a cycle already sounds like planning to get a *lot* done... that's completing an average of one Nova spec per week (even when averaged through the freeze weeks). Maybe as a goal it's undershooting a bit, but it's still a very impressive quantity to be able to consistently accomplish. Many thanks and congratulations to all the folks who work so hard to make this happen in Nova, cycle after cycle.
That perspective literally hadn't occurred to me from here with my face mashed up against the trees [3]. Thanks fungi.
Note that having that "big picture" is I think the main reason why historically, until very recently, there was a subgroup of the nova core team that was the specs core team, because what was approved in specs could have wide impacts to nova and thus knowing the big picture was important.
Good point, Matt. (Not that I think we should, or could, go back to that...)
efried
[1] https://review.opendev.org/#/c/685857
[2] https://review.opendev.org/#/c/685857/4/README.rst@219
[3] For non-native speakers, this is a reference to the following idiom: https://www.dictionary.com/browse/can-t-see-the-forest-for-the-trees
Update:
the nova-specs patch introducing the "Core Liaison" concept [1].
This is merged (it's now called "Feature Liaison"). Here's the new spec template section [2] and the FAQ [3]. Thanks to those who helped shape it.
(A) Note that the idea of capping the number of specs is (mostly) unrelated, and we still haven't closed on it. I feel like we've agreed to have a targeted discussion around spec freeze time where we decide whether to defer features for resource reasons. That would be a new (and good, IMO) thing. But it's still TBD whether "30 approved for 25 completed" will apply, and/or what criteria would be used to decide what gets cut.
Nothing new here.
efried
[2] http://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/uss...
[3] http://specs.openstack.org/openstack/nova-specs/readme.html#feature-liaison-...
On 10/2/2019 4:18 PM, Eric Fried wrote:
I like the >2-core idea, though the real difference would be asking for cores to consider "should we do this *in this cycle*" when they +2 a spec. Which is good and valid, but (I think) difficult to explain/track/quantify/validate. And it's asking each core to have some sense of the "big picture" (understand the scope of all/most of the candidates) which is very difficult.
Note that having that "big picture" is I think the main reason why historically, until very recently, there was a subgroup of the nova core team that was the specs core team, because what was approved in specs could have wide impacts to nova and thus knowing the big picture was important. I know that not all specs are the same complexity and we changed how the core team works for specs for good reasons, but given the years of "why aren't they the same core team? it's not fair." I wanted to point out it can be, as you said, very difficult to be a specs core for different reasons from a nova core.
On Mon, Sep 30, 2019 at 06:09:16PM -0500, Eric Fried wrote:
Nova developers and maintainers-
[...]
I'd like to try a couple more.
(A) Constrain scope, drastically. We marked 25 blueprints complete in Train [3]. Since there has been no change to the core team, let's limit Ussuri to 25 blueprints [4]. If this turns out to be too few, what's the worst thing that happens? We finish everything, early, and wish we had done more. If that happens, drinks are on me, and we can bump the number for V.
I welcome scope reduction: focusing on fewer features, and on stability and bug fixes rather than "more gadgetries and gongs". That also means less frenzy, less split attention, fewer mistakes, more retained concentration, and more serenity. And, yeah, any reasonable person would read '25' as _an_ educated limit rather than some "optimal limit".
If we end up with bags of "spare time", there's loads of tech-debt items, performance (it's a feature, let's recall) issues, and meaningful clean-ups waiting to be tackled.
[...]
On 2019-10-03 12:10:54 +0200 (+0200), Kashyap Chamarthy wrote:
On Mon, Sep 30, 2019 at 06:09:16PM -0500, Eric Fried wrote:
[...]
(A) Constrain scope, drastically. We marked 25 blueprints complete in Train [3]. Since there has been no change to the core team, let's limit Ussuri to 25 blueprints [4]. If this turns out to be too few, what's the worst thing that happens? We finish everything, early, and wish we had done more. If that happens, drinks are on me, and we can bump the number for V.
I welcome scope reduction: focusing on fewer features, and on stability and bug fixes rather than "more gadgetries and gongs". That also means less frenzy, less split attention, fewer mistakes, more retained concentration, and more serenity. And, yeah, any reasonable person would read '25' as _an_ educated limit rather than some "optimal limit".
[...]
Viewing this from outside, 25 specs in a cycle already sounds like planning to get a *lot* done... that's completing an average of one Nova spec per week (even when averaged through the freeze weeks). Maybe as a goal it's undershooting a bit, but it's still a very impressive quantity to be able to consistently accomplish. Many thanks and congratulations to all the folks who work so hard to make this happen in Nova, cycle after cycle.
On Thu, 3 Oct 2019, Kashyap Chamarthy wrote:
I welcome scope reduction: focusing on fewer features, and on stability and bug fixes rather than "more gadgetries and gongs". That also means less frenzy, less split attention, fewer mistakes, more retained concentration, and more serenity. And, yeah, any reasonable person would read '25' as _an_ educated limit rather than some "optimal limit".
If we end up with bags of "spare time", there's loads of tech-debt items, performance (it's a feature, let's recall) issues, and meaningful clean-ups waiting to be tackled.
Since I quoted the above text and referred back to this entire thread in it, I thought I'd better:
a) say "here here" (or is it "hear hear"?) to the above
b) link to https://anticdent.org/fix-your-debt-placement-performance-summary.html which has more to say and an example of what you can get with "retained concentration"
participants (11)
- Balázs Gibizer
- Chris Dent
- Dan Smith
- Eric Fried
- Ghanshyam Mann
- Jeremy Stanley
- Kashyap Chamarthy
- Matt Riedemann
- Sylvain Bauza
- Thierry Carrez
- Tom Barron