
Well, you should check again, because it hasn't been text to text for a while. There are multimodal models now.


If we want to get pedantic, in the context of practical transformer models it has never been text to text. It has always been tokens to tokens. The tokens can represent anything. "Multimodal" is a marketing term.
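
To make the "tokens to tokens" point concrete, here is a toy sketch. Everything in it is an assumption for illustration: the function names, the byte-level text vocabulary, and the crude pixel quantization are hypothetical, and real models typically use learned tokenizers or encoders instead. The only point is that text and image data both end up as integer IDs in one sequence.

```python
# Toy illustration only: the vocabulary layout and helper names are hypothetical.
TEXT_VOCAB = 256            # one token ID per possible byte value
IMAGE_LEVELS = 16           # number of intensity bins for grayscale pixels
IMAGE_OFFSET = TEXT_VOCAB   # image token IDs start right after the text IDs


def text_to_tokens(text: str) -> list[int]:
    """Map text to token IDs by using its UTF-8 bytes directly."""
    return list(text.encode("utf-8"))


def image_to_tokens(pixels: list[list[int]]) -> list[int]:
    """Map grayscale pixels (0..255) to token IDs by coarse quantization."""
    return [
        IMAGE_OFFSET + (p * IMAGE_LEVELS) // 256
        for row in pixels
        for p in row
    ]


# Both modalities become plain integers in a single sequence,
# which is all a transformer's embedding lookup ever sees.
sequence = text_to_tokens("hi") + image_to_tokens([[0, 255], [128, 64]])
print(sequence)
```

The two ID ranges do not overlap, so the model can tell the modalities apart purely by token ID; nothing else in the sequence marks one part as "text" and the other as "image".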



