
Copyright already worries about this sort of thing a great deal, and it's actually a lot better thought out than your average hacker is aware of. There are no hard and fast rules, but generally the thing being sued over has to be creative enough to be copyrightable in the first place. Small snippets do not qualify for copyright protection alone.



I'm not sure this is true, at least for copyright in the common law sense.

Oracle got copyright on API signatures…

In civil law there is a bar to protection if the work lacks "substantial" creativity. But even this bar is extremely low. More or less everything besides maybe simple math formulas is protected.


Oracle got a very thin copyright on API signatures. The "programmer convenience" ruling in Google v. Oracle basically precludes almost all copyright action on APIs alone.


No, they got absolute copyright on the API signatures.

The court did not even question the copyright; it just assumed the APIs are copyrighted by Oracle. Then it looked for reasons why copying the APIs could possibly be fair use…

By the skin of their teeth they found some very involved and case specific reasons why Google's use of the copyrighted APIs was, after all, fair use.

https://www.bhfs.com/insights/alerts-articles/2021/supreme-c...


The reason why SCOTUS bent over backwards to not talk about copyrightability was not because they assumed it was true for APIs, but because they didn't feel like they had all the facts. They basically said "we don't know if it's copyrightable, but if it is, here's a ruling that makes this case and anything similar to it go away".

Oracle only has copyright over APIs in the Federal Circuit, because they were able to hoodwink the judge into applying patent logic[0] to a copyright case. In other circuits it's still up in the air. And in the Ninth Circuit[1] there's already loads of controlling precedent that would have resulted in Oracle's case being summarily dismissed, API copyright or no.

The term "thin copyright" is a term of art. It refers to the kind of copyright protection you get from combining uncopyrightable elements in a creative way. For example, you can't own a particular chord progression. But, if you combine that with, say, a particular instrument, some audio engineering techniques, the subject matter of the lyrics, and so on... then you start getting something that requires creative effort and thus is copyrightable. Courts still have to take this into account when ruling on copyright claims as they do not want to give people a monopoly over just the chord, or just that instrument, etc.

In the case of APIs, we're talking about a series of names, plus an arrangement of type signatures that go with them. Very much a thin copyright, as the legal profession in the US calls it.

And when you have thin copyright, courts are going to be more liberal with handing out fair use exceptions. The "programmer convenience" argument that SCOTUS adopted means that copying an API to put in a different platform is OK. The Ninth Circuit says that copying an API to reimplement a platform that other people's code relies upon is also OK. There's very little room left to actually make a copyright claim on an API alone.

In the case of Copilot, it's not merely copying APIs and filling them out with novel details. It is either generating wholly novel code, or regurgitating training data, the latter of which is just a regular ol' infringement claim with no difficult legal questions to worry about.

[0] The Court of Appeals for the Federal Circuit is the only court with subject-matter jurisdiction over patent claims. When you're the only person who can make hammers, everything looks like a nail.

[1] The Ninth Circuit Court of Appeals has jurisdiction over California, which means it takes on the brunt of copyright cases.


I still don't buy the part that there is not much to worry about.

The thing you call "thin copyright" is still copyright. Being protected or not is in the end a binary judgment: If your stuff is "a little bit" protected it is actually fully protected—with all consequences that follow from that.

Also, the mere "assumption" by the highest US court that APIs are protected is a very strong signal. They could just have ruled that there is no protection at all; case closed. But they preferred to go for a weasel solution. There are reasons for this… They deliberately didn't open the door to API freedom. (Most likely so they can still wield that weapon against foreign competition should they feel like it some day.)

The point is: IP law is completely crazy. The smallest brain-farts are routinely protected.

The exceptions to this rule are actually stronger in civil law, but still even in the EU single words or sub-second audio samples are protected by default. (Regarding APIs the situation is better though: It's legal to reverse engineer something for e.g. compatibility, and a few other reasons; but those are explicit exceptions. The default is that almost every expression of even the slightest form of human "creativity" is copyrighted; the bar is extremely low, and actually gets pushed constantly lower and lower by common law influence.)

So on both sides of the Atlantic the default is that every single line of code is protected. There is nothing like a lower bound in size. Then, from there, you could try to argue that there should be an exception from this protection in some particular case, e.g. that there was no "creativity" at all involved. But you will need to win an often very hard, expensive, and ridiculously long fight over that issue, and winning it is nothing like a sure thing; the default is that just everything is protected to the max. (Just have a look at all the craziness around news headlines in the EU; Google lost that case back then. To understand this better, as it may be very surprising to US people: civil law does not recognize anything like "fair use". There are exceptions to copyright protection that in the end have almost the same effect, like grants for libraries or educational purposes, but those exceptions, and their limitations, are listed explicitly in the law. If no exception is listed, there just isn't one, and only the very vague "creativity bar" remains.)

Regarding Copilot: It makes little difference whether this machine spits out verbatim copies of (clearly copyrighted!) snippets or some "remix" thereof. There is no "novel" code if at best all this machine does is create "remixes" of the code it has in its database based on the query given. (Its "knowledge base" is nothing other than a very funky database; technical details regarding the actual implementation of that database or its query system should not matter legally.)

Before this comes up again: No, any comparisons to how humans learn are irrelevant in this consideration. That machine is not a human. It's a machine. End of story. So even if you also consider a human brain a kind of "funky database", this makes no difference.



