
How long does it take the operator to sift through those 10,000 attempts to find the successful one, when it's not a contrived benchmark where the desired answer is already known ahead of time? LLMs generally don't know when they've failed; they just barrel forward and leave the user to filter out the junk responses.



I have an idea! We should train an LLM with reasoning capabilities to sift through all the attempts! /s


Why the /s? Isn't that an approach some people are actually trying to take?



