Maybe I'm weird for doing this, but I always test models like this to gauge their confidence. Like you just showed, a lot of the time they'll just say whatever they "think" will satisfy the prompt.
People see that these things generate code and, due to their lack of understanding, automatically assume that this is all software engineering is.
Then we have the current batch of YC execs heavily pushing "vibe coded" startups. The sad reality is that this strategy will probably work, because all they need is the next credulous business guy to buy the vibe-coded startup. There's so much money in the AI space that I fully believe you can make billions of dollars this way through acquisition (see OAI buying Windsurf for billions of dollars, likely to devalue Cursor's also-absurd valuation).
I'm not a luddite. I'm a huge fan of companies spending a decent chunk of money on R&D for innovative new projects, even when there's a high risk of failure. But the current LLM hype is not just an R&D project anymore. It's being pushed as a full-on replacement for human labor when it's clearly not ready. And now we're valuing AI startups at billions of dollars and planning to spend $500B on AI infrastructure so that we can generate more Ghibli memes.
At some point this has to stop, but I'm afraid that by then the damage will already be done. Even worse, the idiots who led this exercise in massive waste will just hop onto the next hype train.
I prefer to think of Haskell-like lazy evaluation as constructing a dataflow graph. The expression `map f (sort xs)` constructs a dataflow graph that streams each output of the sort function to `f`, and then printing the result begins running that job. Through that lens, the Haskell program is more like constructing a Spark pipeline. But you can also think of it as just sorting a list, then transforming each element with a function. It only makes a difference in resource costs or when there's potential nontermination involved, unless you use unsafe effects (e.g. `unsafePerformIO`).
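Here's a minimal sketch of that lens (`f` and `xs` are made-up examples): the `let` binding only builds the pipeline, and demanding a single element runs only as much of the job as is needed.

```haskell
import Data.List (sort)
import Debug.Trace (trace)

-- f announces each time it is actually applied, so we can watch how much
-- of the dataflow graph a given demand actually runs.
f :: Int -> Int
f x = trace ("f applied to " ++ show x) (x * 10)

main :: IO ()
main = do
  let xs = [3, 1, 2] :: [Int]
      -- This binding builds the pipeline; nothing is sorted or mapped yet.
      pipeline = map f (sort xs)
  -- Demanding only the first element forces only what is needed:
  -- the trace prints once, not three times.
  print (head pipeline)
```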
Is there a way to think of proofs as being lazy? Yes, but it's not what you think. It's an idea in proof theory called polarization. Parts of a proof can be classified as positive or negative. Positive parts roughly correspond to strict evaluation, and negative parts roughly correspond to lazy evaluation.
To explain a bit more: suppose you want to prove that all chess games terminate. You start by proving "there is no move in chess that increases the number of pieces on the board." This is a lemma with type `forall m: ChessMove, forall b: BoardState, numPieces b >= numPieces (applyMove m b)`. Suppose you now want to prove that, throughout a game of chess, the amount of material never increases. You would do this by inducting over the lemma, which is essentially the same as using it in a recursive function that takes in a board state and a series of moves, and outputs a proof that the final state does not have more material than the initial state. This is compact, but intrinsically computational. But now you can imagine unrolling that recursive function and getting a different proof that the amount of material never increases: simply write out every possible chess game and check. This is called "cut elimination."
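To make that concrete, here is a small Lean sketch under toy assumptions: `BoardState`, `ChessMove`, `applyMove`, and `playGame` are hypothetical stand-ins rather than a real chess formalization, but the shape of the whole-game proof, a recursive function that threads the single-move lemma through a list of moves, is exactly the inductive argument described above.

```lean
structure BoardState where
  numPieces : Nat

inductive ChessMove where
  | mk

-- Placeholder: a real applyMove would remove captured pieces, never add any.
def applyMove (_m : ChessMove) (b : BoardState) : BoardState := b

-- Play out a whole game from a starting board.
def playGame (b : BoardState) : List ChessMove → BoardState
  | []      => b
  | m :: ms => playGame (applyMove m b) ms

-- The single-move lemma: no move increases the number of pieces.
-- (Trivial here only because applyMove is a stub.)
theorem move_lemma (m : ChessMove) (b : BoardState) :
    (applyMove m b).numPieces ≤ b.numPieces :=
  Nat.le_refl _

-- "Inducting over the lemma": the whole-game proof is literally a
-- recursive function that applies the lemma at every move.
theorem game_nonincreasing :
    ∀ (ms : List ChessMove) (b : BoardState),
      (playGame b ms).numPieces ≤ b.numPieces
  | [], _ => Nat.le_refl _
  | m :: ms, b =>
      Nat.le_trans (game_nonincreasing ms (applyMove m b)) (move_lemma m b)
```

Unrolling `game_nonincreasing` over every possible sequence of moves, instead of keeping the recursion, is the cut-elimination step described above.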
So you can see there's a sense in which every component of a proof is "executable," and you can see whether it executes in a strict or lazy manner. Implications ("if A, then B") are lazy. Conjunctions ("A and B") can be either strict or lazy, depending on how they're used. I'm at the edge of my depth here and can't explain more -- in all honesty, I never truly grokked proof polarization.
Conversely, in programming languages, it's not strictly accurate to say that the C program is strict and the Haskell program is lazy. In C, function definitions and macro expansions are lazy. You can have the `BAR()` macro create a `#error`, and yet `FOO(BAR())` need not create a compile error. In Haskell, bang patterns, primitives like `Int#`, and the `seq` operator are all strict.
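A small sketch of the Haskell side (the function names are made up for illustration): an argument that is never used is never forced, but a bang pattern or `seq` forces it regardless.

```haskell
{-# LANGUAGE BangPatterns #-}
import Debug.Trace (trace)

-- ignoreLazy never looks at its second argument, so the traced thunk
-- passed to it is never evaluated.
ignoreLazy :: a -> b -> a
ignoreLazy x _ = x

-- ignoreStrict still ignores its second argument, but the bang pattern
-- forces it to weak head normal form before the body runs.
ignoreStrict :: a -> b -> a
ignoreStrict x !_ = x

main :: IO ()
main = do
  print (ignoreLazy   1 (trace "forced (lazy)"   (2 :: Int)))  -- no trace output
  print (ignoreStrict 1 (trace "forced (strict)" (2 :: Int)))  -- trace prints
  print (trace "forced (seq)" (3 :: Int) `seq` 4)              -- seq forces it too
```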
So it's not the case that "proofs are lazy, C is strict, and Haskell is lazy, therefore Haskell is more like a proof." It's not even accurate to say that C is strict and Haskell is lazy. Within a proof, and within a C or Haskell program, you can find both lazy parts and strict parts.
I feel like the opposite is true, but maybe the issue is that we live in separate bubbles. Oftentimes I see people on X and elsewhere making wild claims about the capabilities of AI, and rarely do they link to the actual output.
That said, I agree that AI has been amazing for fairly closed-ended problems like writing a basic script or even writing scaffolding for tests (it's about 90% effective at producing tests I'd consider good, assuming you give it enough context).
Greenfield projects have been more of a miss than a hit for me. It starts out well but if you don't do a good job of directing architecture it can go off the rails pretty quickly. In a lot of cases I find it faster to write the code myself.
I'm in the same bubble. I find that if they do link to it, it's some basic, unimpressive demo app. That said, I want to see a video of one of these people who apparently 10x'd their programming going up against a dev without AI across various scenarios. I just think it would be interesting to watch if they had a similar base skill and understanding of things.
> That said, I want to see a video of one of these people who apparently 10x'd their programming going up against a dev without AI across various scenarios.
It would be interesting, but do understand that if AI coding is totally fantastic in one domain (basic automation scripting) and totally crappy in another (existing, complex codebases), it's still a (significant) improvement over the pre-AI days.
Concrete example: a few days ago I had an AI model write me a basic MCP tool for creating a Jira story. Within 15 minutes it had written the API function for me; I manually wrapped it to make it an MCP tool, tested it, created tens of stories from a predefined list, and verified it worked.
Now, if you already know the Jira APIs (endpoints, auth, etc.), you could do it with similar speed. But I didn't. Just finding the docs would have taken me longer.
Code quality is fine. This is not production code. It's just for me.
Yes, there are other Jira MCP libraries already. It was quicker for me to write my own than to figure out the existing ones (ditto for GitHub MCP). When using an LLM to solve a coding problem is faster than using Google/SO/official docs/existing libraries, that's clearly a win.
Would I do it this way for production code? No. Does that mean it's bad? No.
I don't think ChatGPT really disrupted Google search? It definitely forced Google to release Gemini and related products, though. Google still has billions of users, and they now have AI integrated with search. The latest Gemini models are also as capable as, if not more capable than, some of OAI's models.
I don't see how Altman is going to disrupt Apple with just Ive and a company no one had heard of before.
Machines have been outperforming humans at a variety of tasks for quite a while now. I'm unconvinced that AlphaEvolve can lead to some sort of singularity.