On 5/30/25 4:01 PM, Eoghan Glynn wrote:
Also worth noting the actual choice of models in this small sample suggests not much weight is being put on the current policy's steer towards: "exclusively using compatible licensed inputs to train the model, using a model released as open source, using a model trained exclusively on compatible licensed code".
To catch up with emerging practice, I'd suggest we augment the formal policy with "Assisted-By" as soon as possible. FWIW, I think we should also simplify, shorten, and make the policy more permissive overall (e.g. by removing the language on the choice of model and pushing the secondary scanning into the code review or merge gating flow). But at a minimum we should get the Assisted-By option codified soon.
Agreed on simplifications and adding Assisted-By as an option. But this is not the time to remove the clear reminder to contributors that there is real uncertainty and risk around AI tools and whether their outputs can be used in open source software. Tagging the commits does mitigate some of the risk (it at least gives us the option to rip out that code someday if we have to), but making smart choices about which tools you use in the first place is even better. The world is only beginning the journey of understanding the legal implications of AI generation in the context of copyrighted works; see, for example: https://www.wired.com/story/disney-universal-sue-midjourney/

Allison