
It's not a bug. It's the reality of token generation. It's bottlenecked by memory bandwidth.

Please publish your own benchmarks proving me wrong.
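
For anyone who wants to sanity-check the memory-bandwidth claim before benchmarking, here is a rough back-of-the-envelope sketch. It assumes single-stream decoding where every weight is read from memory once per generated token, and it ignores KV-cache traffic, batching, and compute time. The function name and the numbers (7e9 parameters at 2 bytes each, 1e12 bytes/s of bandwidth) are made up for illustration, not measurements from any real hardware.

```python
def max_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode speed if all weights are streamed once per token."""
    if model_bytes <= 0:
        raise ValueError("model_bytes must be positive")
    return bandwidth_bytes_per_s / model_bytes


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the arithmetic.
    model_bytes = 7e9 * 2   # 7e9 parameters stored at 2 bytes each
    bandwidth = 1e12        # 1e12 bytes/s of memory bandwidth
    bound = max_tokens_per_second(model_bytes, bandwidth)
    print(f"bandwidth-bound ceiling: {bound:.1f} tokens/s")
```

If measured throughput sits close to this ceiling, that is consistent with a bandwidth bottleneck; if it sits far below, something else is probably limiting it.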



I cannot reproduce your bug on AMD. I'm going to have to conclude this is a vendor issue.



