On 6/12/25 4:01 PM, Eoghan Glynn wrote:
>
> We clearly have vanishingly low invocation of the existing policy
> though, at least for the OpenStack codebases.
>
> Now that could be due to one of several reasons, including:
>
> (a) very low genAI tool usage among our contributors
>
> or:
>
> (b) widespread non-observance of the policy's requirements
> or:
>
> (c) generative AI is good at producing code when it was trained on a
> million examples of similar code (like using a popular javascript
> library), but not good at producing code that's rare, unusual, or in a
> unique problem domain. An LLM trained on Nova code might be useful for
> writing Nova code, but most generative AI tools won't be.
Possible, though TBH I'd be surprised if the models in wide use today
didn't include in their training corpus most (if not all) public and
permissively licensed github repos that meet some basic quality/activity
threshold.

So I'd suspect the Nova code is likely in the training mix already for
the models people are actually using right now ... though of course,
open to correction on that.
> It sounds like a good next step would be to have a conversation with
> developers, rather than making assumptions about what developers want
> or need.
Fair enough, happy to have a wider conversation. Of the small sample
I've already talked to, it seemed the current policy was viewed as a bit
too heavyweight. IIUC, similar sentiments came up in the discussion that
Julia ran a while back, FWIW.
Cheers,
Eoghan