
Two questions:

1. Does an AI "reading" source code that has been otherwise lawfully obtained infringe copyright? Is this even enforceable?

2. Why write a new license rather than just adding a rider to the AGPL? As written, this one is missing the language the AGPL uses to cover usage (rather than just copying) of the software.



> Does an AI "reading" source code that has been otherwise lawfully obtained infringe copyright?

To the extent that this has been decided under US law, no. AI training on legally acquired material has been deemed fair use.


At first I was going to comment about how much I personally avoid the AGPL, but now you've got me thinking: technically, any LLM training on AGPL code, or even GPL or similar code, is very likely violating those licenses regardless of how they are worded. If I remember correctly, the GPL already prevents you from translating the code to another programming language to circumvent the license. The AGPL should have a similar clause.


> The GPL already makes it so you cannot translate to another programming language to circumvent the license

The operative words are the last four there. The GPL, like all other software licenses (copyleft or not), can only bind you as strongly as the underlying copyright law does. It provides a copyright license that grants the licensee favorable terms, but it's still fundamentally the same framework. Anything that is fair use under copyright law is also going to be fair use under the GPL (and LLM training is probably transformative enough to be fair use, though that remains to be seen).


> and LLMs are probably transformative enough to be fair use, though that remains to be seen.

Arguably, at least in the US, it has been seen. Unless someone comes up with a novel argument not already advanced in the Anthropic case about why training an AI on otherwise legally acquired material is not transformative enough to be fair use, I don't see how you could read the ruling any other way.


I think people are holding on to hope that it gets appealed. Though you're right, the gavel has already fallen; training is fair use.


If LLM training violates the AGPL, it also violates MIT. People focus too much on the copyleft terms of the *GPL licenses; MIT, and most other permissive licenses, still require attribution.

Honestly, with how much focus there tends to be on *GPL in these discussions, I get the feeling that MIT-style licenses are the most frequently violated, because people treat them as public domain.


This is a good call out. What would it fundamentally change, though? MIT is a few hairs away from just publishing something under the public domain, is it not? There's the whole "no warranty or liability if this code blows up your potato" bit of the MIT license, but good luck trying to reverse engineer from the LLM which project was responsible for your vibe coding a potato into exploding.


To point one: normally, no. However, this license does not ask that question; it simply says that if you let an AI read the code, your license to use the software is void.


Can you actually do that in US law?


I definitely don't know enough to say either way. On one hand, general contract law seems to allow the terms of a contract to be pretty much anything, as long as they're not ambiguous or grossly unfair. On the other, even some real lawyers have doubts about the enforceability of certain widely used software licenses. So I could see it going either way.


Do you think there _should_ be a legal mechanism for enforcing the kind of rules they're trying to create here? I have mixed feelings about it.



