
I don't think any of these companies are so reductive and short-sighted as to try to game the system. However, Goodhart's Law comes into play. I am sure they have their own metrics that are much more detailed than these benchmarks, but the fact remains that LLMs will be tuned according to elements that are deterministically measurable.

