Yet we all use web browsers that copy copyrighted text from buffer to buffer all...

heavyset_go · on Nov 3, 2022

A browser isn't an amalgamation of billions of pieces of other works. A browser executes and renders code it's served.

Copilot's corpus is quite literally tomes of copyrighted work that are encoded and compressed in its neural network, from which it launders that work to create similar works. Copilot itself, the neural network, is that corpus of encoded and compressed information; you can't separate the two. Copilot stores and distributes that work without any input from rightsholders, and it does it for profit.

A better analogy would be between a browser and a file server filled with copyrighted movies whose operator charges $10/mo for access. The browser is just a browser in this analogy, whereas the file server is the corpus that forms Copilot itself.

golemotron · on Nov 10, 2022

I don't think "compressed" is a good way to think about this because the process is lossy. The model can't be "decompressed."

If you think this way, hashing is a copyright violation.

ginsider_oaks · on Nov 3, 2022

the actual copying isn't a problem, it's distribution. if i buy access to a PDF i'm not going to get in trouble for duplicating the file unless i send it to someone else.

when someone uploads their copyrighted text to a web page they are distributing it to whoever visits that page. the browser is just the medium.

golemotron · on Nov 3, 2022

Is that the legal standard in copyright cases?