On Thu, Jun 12, 2025 at 1:11 PM Allison Randal <allison@lohutok.net> wrote:
On 5/30/25 4:01 PM, Eoghan Glynn wrote:
Also worth noting the actual choice of models in this small sample suggests not much weight is being put on the current policy's steer towards: "exclusively using compatible licensed inputs to train the model, using a model released as open source, using a model trained exclusively on compatible licensed code".
To catch up with emerging practice, I'd suggest that we augment the formal policy with "Assisted-By" as soon as possible. FWIW, I think we should also simplify, shorten, and make the policy more permissive overall (e.g. by removing the language on the choice of model, and pushing the secondary scanning into the code review or merge gating flow). But at a minimum, we should get the Assisted-By option codified soon.
Agreed on simplifications and adding Assisted-By as an option.
But, this is not the time to remove the clear reminder to contributors that there is real uncertainty and risk around AI tools and whether their outputs can be used in open source software. Tagging the commits does mitigate some of the risk (it at least gives us the option to rip out that code someday if we have to), but making smart choices about which tools you use in the first place is even better.
The world is really only beginning the journey of understanding the legal implications of AI generation in the context of copyrighted works, for example:
https://www.wired.com/story/disney-universal-sue-midjourney/
Totally get the concern about the risks of future litigation. We clearly have vanishingly low invocation of the existing policy, though, at least for the OpenStack codebases. That could be due to one of several reasons, including: (a) very low genAI tool usage among our contributors, or (b) widespread non-observance of the policy's requirements.

IMO both are bad outcomes for the community, since (a) cuts us off from the productivity gains accruing from using these tools, whereas (b) cuts us off from the future risk mitigation of clearly marked AI-assisted contributions.

FWIW, I suspect that the current wording of the policy could be a factor in either (a) or (b), and that a simplified, lightweight policy is more likely to be widely understood and observed.

Cheers,
Eoghan
Allison