I'm about 50kloc into a project: a React Native app with a Golang backend for recipes, with grocery lists, collaborative editing, and household sharing — so a complex data model and runtime. It's purely an experiment in "what's it like to build with AI, no lines of code directly written, just directing the AI."
As I go through features, I'm comparing a matrix of Cursor, Cline, and Roo, with the various models.
While I'm still working on the final product, there's no doubt in my mind that Sonnet is the only model that works with these tools well enough to be agentic (rather than just doing single-file work).
I'm really excited now to compare this 3.7 release and see how good it is at avoiding some of the traps 3.5 can fall into.
"no lines of code directly written, just directing the AI"
/skeptical face.
Without fail, every. single. person. I've met who says that actually means "except for the code that I write", or "except for how I link the code it builds together by hand".
If you are 50kloc into a large, complex project that you have literally written none of, and have, e.g., used Cursor to generate the code without any assistance... well, you should start a startup.
...because that's what Devin was supposed to be, and it was enormously and famously terrible at it.
So that would be either a) terribly exciting, or b) hyperbole.
I’m currently doing something very similar to what GP is doing - I’m building a hobby project that’s a desktop app with a web frontend. It’s a map editor with a 3D view. My estimate is that 80-90% of the code was written by AI. Sure, I did have to intervene or write some of the more complex parts myself, but it’s still exciting to me that in many cases it took just a single prompt to add a new feature or change existing behavior. Judging from the complexity of the project, it would have taken me 4-5x longer to write it completely by hand. It’s a game changer for me.
> My estimate is that 80-90% of the code was written by AI
Nice! It is entirely reasonable both to do that and to be excited about it.
…buuut, if that’s what you’re doing, you should say so.
Not:
“no lines of code directly written, just directing the AI”
Because those (gluing together AI code by hand and having the agent do everything) are different things, and one of them is much much MUCH harder to get right than the other one.
That last 10-15%. Self-driving cars are the same story, right?
I don’t think this is a fair take. For self driving cars, you care about that because safety is involved and the reliability of the AI is the product itself.
For OP, the product is the product, how they got there is mostly irrelevant. We don’t really care what IDE they used (outside of being a tooling nerd).
AI is hard; edge cases are hard. AI sucks at edge cases.
Between AI for cars and AI for software the long tail of edge cases that have to be catered for is different, yes.
...but I'm sure the same will apply for AI for art (e.g. hands), and AI for (insert domain here).
Obviously no analogy is perfect, but I think you have to make a real effort to look away from reality not to see the glaringly obvious parallels in cars, art, programming, problem solving, robots, etc., where machine learning models struggle with edge cases.
Does the tooling they used matter? No, not at all.
...but if they've claimed to solve the 'edge case problem', they've done something really interesting. If not, they haven't.
So, don't claim to have done something really interesting if you haven't.
You can say "I've been using AI to build a blah blah blah. It's great!" and that's perfectly ok.
You have to go out of your way to say "I've been using an AI to build blah blah blah and I haven't written any of it, it's all generated by AI". <-- kinda attention seeking.
"No lines of code directly written"? Really? Why did you mention that? You got the AI to write your software for you? That sounds cool! Let's talk! Are you an AI consultant by any chance? (Yes, they are.) ...but.
No. You didn't. You really didn't. I'm completely happy to call people out for doing that; it's not unfair at all.
That's the point of the experiment I'm doing: finding out what it takes to get these tools to generate all the code, with me just directing.
I literally have not written a line of code. The AI agent configures the build systems. It executes the `go install` command. It configures the infrastructure via terraform.
It takes a lot of reading the generated code to see what I agree with or not, and redirecting refactorings. It takes understanding how to describe problem statements that are translated into design docs, which are translated into task lists. It's still a lot of knowledge work on how to build software. But now the coding from those plans, which might have taken a day, takes 20 minutes.
Regarding startups, there's nothing here I'm doing that isn't just learning the tools of agentic coding. The business here might be advising people on how to do it themselves.
If you know how to architect code well, you can guide the AI to create smaller, more targeted modules. That way, as you 'write code with AI', you give it a targeted subset of the files to edit on each prompt.
In a way the AI becomes the dev and you become the code reviewer. Often as the AI is writing the code, you're thinking about the next step.