I really hope sonnet 4 is not obsessed with tool calls the way 3-7 is. 3-5 was s...

lherron · 2025-05-22T18:43:29 1747939409

Overly aggressive “let me do one more thing while I’m here” in 3.7 really turned me off as well. Would love a return to 3.5’s adherence.

mkrd · 2025-05-22T20:53:14 1747947194

Yes, this is pretty annoying. You give it a file and want it to make a small focused change, but instead it almost touches every line of code, even the unrelated ones

artdigital · 2025-05-23T08:53:48 1747990428

Oh jeez yes. I completely forgot that this was a thing. It’s tendency to do completely different things “while at it” was ridiculous

nomel · 2025-05-22T19:25:52 1747941952

I think there was definitely a compromise with 3.7. When I turn off thinking, it seems to perform very poorly compared to 3.5.

Jabrov · 2025-05-23T09:55:59 1747994159

Can be solved with more specific prompting

deciduously · 2025-05-22T17:24:28 1747934668

This feels like more of a system prompt issue than a model issue?

waleedlatif1 · 2025-05-22T17:37:36 1747935456

imo, model regression might actually stem from more aggressive use of toolformer-style prompting, or even RLHF tuning optimizing for obedience over initiative. i bet if you ran comparable tasks across 3-5, 3-7, and 4-0 with consistent prompts and tool access disabled, the underlying model capabilities might be closer than it seems.

ketzo · 2025-05-22T19:05:14 1747940714

Anecdotal of course, but I feel a very distinct difference between 3.5 and 3.7 when swapping between them in Cursor’s Agent mode (so the system prompt stays consistent).