I use LLMs daily for stuff like:
- solving tasks that just require applying knowledge ("here's a paste of my Python import structure. I don't write Python often and I'm aware I'm doing something wrong here because I get this error; tell me the proper way to organise the package").
- writing self-contained throwaway pieces of code ("here's a paste of my DESCRIBE TABLE output, write an SQL query to show the median [...]"); a sketch of the kind of query I mean follows this list.
- as a debugging partner ("I can SSH to this host directly, but Ansible fails to connect with this error, what could be causing this difference").
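For context, the throwaway-query case looks roughly like this. This is only a sketch, assuming PostgreSQL and a made-up `orders` table with made-up column names; it just illustrates the "one-off query I won't maintain" category:

```sql
-- Hypothetical example: median order value per customer.
-- percentile_cont(0.5) is PostgreSQL's continuous-percentile aggregate,
-- which gives the median when ordered by the target column.
SELECT
    customer_id,
    percentile_cont(0.5) WITHIN GROUP (ORDER BY order_total) AS median_order_total
FROM orders
GROUP BY customer_id;
```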
All these use cases work great and save me a lot of time. But for the core work, writing the code I'm actually responsible for, I've had almost no success. I've tried:
- Cursor (the default model; I can't remember which)
- Google's Jules
- OpenAI Codex with o4
I found in all cases that the underlying capability is clearly there (the model can understand and write code), but the end-to-end value just isn't. They could write code that _worked_, but getting them to produce code I'm willing to maintain and "put my name on" took longer than writing it myself would have.
I had to micromanage them endlessly: "Be sure to rerun the formatter and make sure all tests pass", "Please follow the coding style of the repository", "You've added irrelevant comments; remove those", "You've refactored most of the file but forgot a single function". It took many, many iterations on trivial issues, and because each iteration is slow, that meant a lot of context switching, which is also exhausting.
Basically it was like having an intern who has successfully learned the core skill of programming but is not really capable of good collaboration and needs to be babysat all the time.
I asked friends who are enthusiastic vibe coders and they basically said "your standards are too high".
Is the model for success here that you just say, "I don't care about code quality, because I don't have to maintain it; I'll use LLMs for that too"? Am I just not using the tools correctly?
There seem to be two camps: those who can't stop raving about how much of a superpower LLMs are for coding, how they've made them 100x more productive, and how they're unlocking things they could never have done before.
And those who, like you, find it to be an extremely finicky process that requires an extreme amount of coddling to get average results at best.
The only thing I don't understand is why people from the former group aren't all utterly dominating the market and obliterating their competitors with their revolutionary products and blazing-fast iteration speed.