On Tue, Jun 17, 2025 at 11:36 PM Julia Kreger <juliaashleykreger@gmail.com> wrote:
On Fri, Jun 13, 2025 at 10:03 AM Eoghan Glynn <eglynn@redhat.com> wrote:
On Fri, Jun 13, 2025 at 5:02 PM Allison Randal <allison@lohutok.net> wrote:
On 6/12/25 4:01 PM, Eoghan Glynn wrote:
We clearly have vanishingly low invocation of the existing policy though, at least for the OpenStack codebases.
Now that could be due to one of several reasons, including:
(a) very low genAI tool usage among our contributors
or:
(b) widespread non-observance of the policy's requirements
or:
(c) generative AI is good at producing code when it was trained on a million examples of similar code (like using a popular JavaScript library), but not good at producing code that's rare, unusual, or in a unique problem domain. An LLM trained on Nova code might be useful for writing Nova code, but most generative AI tools won't be.
Possible, though TBH I'd be surprised if the models in wide use today didn't include in their training corpus most (if not all) public and permissively licensed GitHub repos that meet some basic quality/activity threshold.
So I'd suspect the Nova code is likely in the training mix already for
the models people are actually using right now ... though of course, open to correction on that.
It sounds like a good next step would be to have a conversation with developers, rather than making assumptions about what developers want or need.
Fair enough, happy to have a wider conversation. Of the small sample
I've already talked to, seemed the current policy was seen as a bit too heavyweight. IIUC similar sentiments came up in the discussion that Julia ran a while back, FWIW.
Indeed. This is all happening as the tools have evolved substantially. We've gone from bad code, to slightly better code. And then even better code, where we can ask it to rewrite the suggestion with OpenStack context and even the context of a project's code base. And now, we're at "productivity tool" level of capability, as some tools are able to understand and carry out specific user-requested actions which are an actual productivity boost, as opposed to just "writing" code.
Two recent examples I'm aware of:
- "find my bug in this code, I'm experiencing it hang this way" which successfully led to some solutions to explore.
- "Please restructure the code base to change the folder of this module and update all references", similar to running "sed s/cmd/command/g" across the code base and "mv cmd command". It didn't get everything right, but it was surprisingly close.
So I think the right thing is to shift our focus to expect that to be the use case, and not the bulk composition of code which we originally envisioned early on.
I don't believe we should greatly dial back the policy document, but we do need to focus our engagement because the number one piece of feedback which continues to resonate with me is that it was just too much for folks to wrap their heads around. We wrote it from the context of the board assuming a similarly leveled reader. And... turns out newer contributors are the folks reading it and their eyes glaze over around the third paragraph.
Thanks Julia,
If there isn't an appetite to dial back the main policy doc, then one obvious approach would be to also provide a super-concise & informal preamble, deliberately written in very approachable language that'll avoid the eye-glaze. Something along the lines of:
"We welcome contributions crafted with the help of AI tools. Use whatever model works best for you, just check the terms and conditions first. You'll need to mark patches prepared with the help of AI. Use either the 'Generated-by' label when you got lots of help, or else 'Assisted-by' when you still wrote most of it by hand. A human must always be in the loop, both to submit and review patches. Be extra careful with security fixes. Copyright law still applies to any preexisting work. And of course, follow your employer’s rules where relevant."
(not wedded to that particular formula of words, just intended as an illustration of the type of thing that might work)
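To make the labelling idea concrete, here is a minimal sketch of how a reviewer or a CI job might look for the two labels in a commit message. Only the label names 'Generated-by' and 'Assisted-by' come from the draft wording above; the function name, the "Label: value" trailer format, and the rule that labels live in the last paragraph of the message are all assumptions made for illustration, not anything the policy specifies.

```python
import re

# Label names are taken from the draft preamble; everything else here
# (trailer syntax, last-paragraph rule) is an assumption for illustration.
AI_TRAILERS = ("Generated-by", "Assisted-by")

_TRAILER_RE = re.compile(
    r"^(?P<key>" + "|".join(re.escape(k) for k in AI_TRAILERS) + r"):\s*(?P<value>\S.*)$",
    re.IGNORECASE,
)


def find_ai_trailers(commit_message: str) -> list[tuple[str, str]]:
    """Return (label, value) pairs found in the last paragraph of the message."""
    # Split on blank lines and drop empty paragraphs.
    paragraphs = [p for p in re.split(r"\n\s*\n", commit_message.strip()) if p.strip()]
    if not paragraphs:
        return []
    found = []
    for line in paragraphs[-1].splitlines():
        match = _TRAILER_RE.match(line.strip())
        if match:
            # Normalise the label to its canonical spelling.
            key = next(k for k in AI_TRAILERS if k.lower() == match["key"].lower())
            found.append((key, match["value"].strip()))
    return found


if __name__ == "__main__":
    # Hypothetical commit message; the tool name is a placeholder.
    message = (
        "Fix hang in retry loop\n\n"
        "Longer description of the change.\n\n"
        "Assisted-by: ExampleTool 1.0\n"
        "Change-Id: I123\n"
    )
    print(find_ai_trailers(message))
```

A check like this only detects whether a label is present; it cannot tell whether the right label was chosen, so the human-in-the-loop review described in the preamble would still be needed.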
While we should absolutely continue to discuss, we do need to revise the document based upon existing input and attempt to focus it. My plan for tomorrow morning, unless something is on fire first thing, is to send out my current working draft with comment permission enabled so that we can collaborate further and move this item forward.
Furthermore, I think it would be good to schedule a follow-up call on this topic, perhaps next Tuesday?
Great idea! Cheers, Eoghan
Thanks,
-Julia
Cheers, Eoghan
Allison
_______________________________________________
Foundation-board mailing list -- foundation-board@lists.openinfra.org
To unsubscribe send an email to foundation-board-leave@lists.openinfra.org