> So basically I, as an open source author, had my code eaten up by Mistral without my consent

Not necessarily. You consented to people reading your code and learning from it when you posted it on GitHub. Whether or not there's an issue with AI doing the same remains to be settled. It certainly isn't clear-cut that separate consent would be required.


MIT/BSD code is fair game, but isn't the whole point of GPL/AGPL "you can read and share and use this, but you can't take it and roll it into your closed commercial product for profit"? It seems like what Mistral and co are doing is a fundamental violation of the one thing GPL is striving to enforce.


No. Either MIT/BSD code isn't fair game because it requires attribution, or GPL/AGPL code is fair game because it isn't copyright infringement in the first place so no license is required.

It'll be a court fight to determine which. Worse, it will be a court fight that plays out in a bunch of different countries, and they probably won't all come to the same conclusion. It's unlikely the two licenses have a different effect here, though. Either they both forbid it, or neither had the power to forbid it in the first place.


Precisely. This is such a basic violation of the GPL that it's mind-boggling they went for it.


Is there an updated version of these licenses that explicitly forbids projects from being used to train AIs?


> but isn't the whole point of GPL/AGPL "you can read and share and use this, but you can't take it and roll it into your closed commercial product for profit"?

You can profit from GPL/AGPL code; you just also have to make all your source code open source and available for everyone to see.


> You consented to people reading your code and learning from it when you posted it on GitHub.

And what if I never posted my code to GitHub, but someone else did? What if someone posted proprietary code they had no rights to onto GitHub at the same time the scraper bots were trawling it? A few years ago some Windows source code was leaked onto GitHub - did Microsoft consent then?


I did not give consent to train on my software, and the license does not allow commercial use of it.

They have taken my code and are now dictating how I can use their derived work.

Personally I think these tools are useful, but if the data comes from the commons, the model should also belong to the commons. This is just another attempt to gain private benefit from public work.

There are legal issues to be resolved, and there is an explosion of lawsuits already, but the fact pattern is simple and applies to nearly all closed-source AI companies.


Mistral is as open as they get; most others are far worse. Here you can use the model without issues, and as others are saying, it's doubtful they would sue you if you used code generated by the model in a commercial app.


This model is more restricted than Mistral and Mixtral - this is a new development from them.


Replit’s replit-code[1,2] is CC BY-SA 4.0 for the weights and Apache 2.0 for the sources. Replit has its own unpleasant history[3], but the model’s terms are good. (The model itself is not as good, but whether that’s a worthwhile tradeoff is up to you; my point is that the tradeoff exists and is meaningful.)

[1] https://huggingface.co/replit/replit-code-v1-3b

[2] https://huggingface.co/replit/replit-code-v1_5-3b

[3] https://news.ycombinator.com/item?id=27424195
