Hacker News new | past | comments | ask | show | jobs | submit login

I really hope sonnet 4 is not obsessed with tool calls the way 3-7 is. 3-5 was sort of this magical experience where, for the first time, I felt the sense that models were going to master programming. It’s kind of been downhill from there.





Overly aggressive “let me do one more thing while I’m here” in 3.7 really turned me off as well. Would love a return to 3.5’s adherence.

Yes, this is pretty annoying. You give it a file and want it to make a small focused change, but instead it almost touches every line of code, even the unrelated ones

Oh jeez yes. I completely forgot that this was a thing. It’s tendency to do completely different things “while at it” was ridiculous

I think there was definitely a compromise with 3.7. When I turn off thinking, it seems to perform very poorly compared to 3.5.

Can be solved with more specific prompting

This feels like more of a system prompt issue than a model issue?

imo, model regression might actually stem from more aggressive use of toolformer-style prompting, or even RLHF tuning optimizing for obedience over initiative. i bet if you ran comparable tasks across 3-5, 3-7, and 4-0 with consistent prompts and tool access disabled, the underlying model capabilities might be closer than it seems.

Anecdotal of course, but I feel a very distinct difference between 3.5 and 3.7 when swapping between them in Cursor’s Agent mode (so the system prompt stays consistent).



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: