Machine learning as it is needs human data and input to progress further.

Synthetic data can be useful up to a certain point, but you can’t expect models to keep improving on synthetic data alone indefinitely.

GDM's moat here is YouTube. They have a bazillion gameplay videos and whatever else. But here it is.

The downside I can see is that most people will stop publishing content online for free, since these companies have absolutely no respect whatsoever for the humans who created the data they use.



I've never understood this argument... The real world is an unbounded training set that is cheap to observe with readily available sensors that have existed for almost a century.


Charging for content means nothing. Meta was pirating media and training against that, and I suspect everyone else is too but hasn’t been caught yet.



