Cloudflare builds OAuth with Claude and publishes all the prompts

rienbdj · 2025-06-03T06:30:13 1748932213

The commits are revealing.

Look at this one:

> Ask Claude to remove the "backup" encryption key. Clearly it is still important to security-review Claude's code!

> prompt: I noticed you are storing a "backup" of the encryption key as `encryptionKeyJwk`. Doesn't this backup defeat the end-to-end encryption, because the key is available in the grant record without needing any token to unwrap it?

I don’t think a non-expert would even know what this means, let alone spot the issue and direct the model to fix it.
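To make the quoted prompt concrete, here is a minimal sketch of why a plaintext "backup" undoes token-based key wrapping. It is a toy, not the library's code: the XOR "wrap", the `wrapped_key` field, and the helper names are assumptions for illustration (only `encryptionKeyJwk` comes from the commit message), and a real implementation would use a proper key-wrapping algorithm.

```python
import hashlib
import secrets


def _keystream(token: str, length: int) -> bytes:
    # Toy derivation: stretch the token into `length` bytes using SHA-256
    # in counter mode. Illustration only, not production cryptography.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(token.encode() + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def wrap_key(data_key: bytes, token: str) -> bytes:
    # XOR the data key with a token-derived keystream.
    return bytes(a ^ b for a, b in zip(data_key, _keystream(token, len(data_key))))


def unwrap_key(wrapped: bytes, token: str) -> bytes:
    # XOR is its own inverse, so unwrapping repeats the same operation.
    return wrap_key(wrapped, token)


def key_readable_without_token(grant: dict, data_key: bytes) -> bool:
    # True if the raw key sits in the stored record in plaintext.
    return data_key in grant.values()


data_key = secrets.token_bytes(32)
token = "token-held-only-by-the-client"  # hypothetical value

# Intended design: the grant record holds only the wrapped key.
grant_ok = {"wrapped_key": wrap_key(data_key, token)}

# The flagged design: the same record also holds an unwrapped copy.
grant_with_backup = {
    "wrapped_key": wrap_key(data_key, token),
    "encryptionKeyJwk": data_key,
}
```

With `grant_ok`, anyone who reads the stored record still needs the token to recover the key. With `grant_with_backup`, reading the record is enough, which is the objection raised in the prompt.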

victorbjorklund · 2025-06-03T08:58:34 1748941114

That is how LLMs should be used today. An expert prompts it and checks the code. Still saves a lot of time vs typing everything from scratch. Just the other day I was working on a prototype and let Claude write code for an auth flow. Everything was good until the last step where it was just sending the user id as a string with the valid token. So if you got a valid token you could just pass in any user id and become that user. Still saved me a lot of time vs doing it from scratch.
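The flaw described here (a valid token plus a client-supplied user id) can be sketched roughly as follows. This is a guess at the shape of the bug, not the commenter's actual code: the token format, `SERVER_SECRET`, and all function names are hypothetical.

```python
import hashlib
import hmac
from typing import Optional

SERVER_SECRET = b"demo-only-secret"  # hypothetical; never hard-code real secrets


def _sign(user_id: str) -> str:
    return hmac.new(SERVER_SECRET, user_id.encode(), hashlib.sha256).hexdigest()


def issue_token(user_id: str) -> str:
    # Toy token: the user id plus an HMAC over it.
    return f"{user_id}.{_sign(user_id)}"


def user_from_token(token: str) -> Optional[str]:
    # Return the user id bound to the token, or None if the signature is bad.
    user_id, _, sig = token.rpartition(".")
    if hmac.compare_digest(sig.encode(), _sign(user_id).encode()):
        return user_id
    return None


def handle_request_flawed(token: str, claimed_user_id: str) -> Optional[str]:
    # The bug: the token is validated, but identity comes from the request.
    if user_from_token(token) is None:
        return None
    return claimed_user_id


def handle_request_fixed(token: str) -> Optional[str]:
    # The fix: identity is derived only from the verified token.
    return user_from_token(token)


mallory_token = issue_token("mallory")
print(handle_request_flawed(mallory_token, "alice"))  # acts as alice
print(handle_request_fixed(mallory_token))            # acts as mallory
```

The point of the fixed handler is that there is no separate user id parameter left for a caller to tamper with.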

Vinnl · 2025-06-03T11:26:47 1748950007

At least for me, I'm fairly sure that I'm better at not adding security flaws to my code (which I'm already not perfect at!) than I am at spotting them in code that I didn't write, unfortunately.

bryant · 2025-06-03T13:31:13 1748957473

They're different mindsets. Some folks are better editors, inspectors, auditors, etc, whereas some are better builders, creators, and drafters.

So what you're saying makes sense. And I'm definitely on the other side of that fence.

blueflow · 2025-06-03T13:43:29 1748958209

When you form a mental model and then write code from that, that's a very lossy transformation. You can write comments and documentation to make it less lossy, but there will be information that is lost to a reviewer, who has to spend great effort to recreate it. If it is unknown how code is supposed to behave, then it becomes physically impossible to verify it for correctness.

This is less a matter of "mindset", but more a general problem of information.

bbarnett · 2025-06-03T14:09:43 1748959783

Whether reviewer or creator, if the start conditions / problem is known, both start with the same info.

"code base must do X with Y conditions"

The reviewer is at no disadvantage, other than the ability to walk the problem without coding.

blueflow · 2025-06-03T14:38:04 1748961484

This is the ideal case where the produced code is well readable and commented so its intent is obvious.

The worst case is an intern or LLM having generated some code where the intent is not obvious and them not being able to explain the intent behind it. "How is that even related to the ticket"-style code.

XCSme · 2025-06-03T11:45:41 1748951141

> Still saves a lot of time vs typing everything from scratch.

In my experience, it takes longer to debug/instruct the LLM than to write it from scratch.

Culonavirus · 2025-06-03T12:13:38 1748952818

Depends on what you're doing. For example when you're writing something like React components and using something like Tailwind for styling, I find the speedup is close to 10X.

XCSme · 2025-06-04T12:59:45 1749041985

Scaffolding works fine, for things that are common, and you already have 100x examples on the web. Once you need something more specific, it falls apart and leads to hours of prompting and debugging for something that takes 30 minutes to write from scratch.

Some basic things it fails at:

  * Upgrading the React code-base from Material-UI V4 → V5
  * Implementing a simple header navigation dropdown in HTML/CSS that looks decent and is usable (it kept having bugs with hovering, wrong sizes, padding, responsiveness, duplicated code etc.)
  * Changing anything. About half of the time, it keeps saying "I made those changes", but no changes were made (it happens with all of them, Windsurf, Copilot, etc.).

ksenzee · 2025-06-03T19:43:48 1748979828

This can’t be stressed enough: it depends on what you’re doing. Developers talking about whether LLMs are useful are just talking past each other unless they say “useful for React” or “useful for Rust.” I mostly write Drupal code, and the JetBrains LLM autocomplete saves me a few keystrokes, maybe. It’s not amazing. My theory is that there just isn’t much boilerplate Drupal code out there to train on: everything possible gets pushed out of code and into configuration + UI. If I were writing React components I’d be having an entirely different experience.

nijave · 2025-06-03T15:35:25 1748964925

Isn't there some way to speed up with codegen besides using LLMs?

frank_nitti · 2025-06-03T19:09:10 1748977750

Some may have a better answer, but I often compare with tools like OpenAPI and AsyncAPI generators where HTTP/AMQP/etc code can be generated for servers, clients and extended documentation viewers.

The trade off here would be that you must create the spec file (and customize the template files where needed) which drives the codegen, in exchange for explicit control over deterministic output. So there’s more typing but potentially less cognitive overhead with reviewing a bunch of LLM output.

For this use case I find the explicit codegen UX preferable to inspecting what the LLM decided to do with my human-language prompt, if attempting to have the LLM directly code the library/executable source (as opposed to asking it to create the generator, template or API spec).
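As a rough illustration of the spec-driven trade-off described above, here is a tiny deterministic generator. It is not the OpenAPI or AsyncAPI tooling: the `SPEC` format, the template, and the generated function shape are all invented for this sketch.

```python
# Hypothetical, minimal spec format (not OpenAPI).
SPEC = {
    "base_url": "https://api.example.test",
    "operations": [
        {"name": "get_user", "method": "GET",
         "path": "/users/{user_id}", "params": ["user_id"]},
        {"name": "delete_user", "method": "DELETE",
         "path": "/users/{user_id}", "params": ["user_id"]},
    ],
}

# Template for one generated function. The generated code only builds a
# (method, url) pair; a real client would send the request.
TEMPLATE = '''def {name}({args}):
    """{method} {path}"""
    return ("{method}", f"{base_url}{path}")
'''


def generate_client(spec: dict) -> str:
    # Same spec in, same source text out: no sampling, no surprises.
    parts = []
    for op in spec["operations"]:
        parts.append(TEMPLATE.format(
            name=op["name"],
            args=", ".join(op["params"]),
            method=op["method"],
            path=op["path"],
            base_url=spec["base_url"],
        ))
    return "\n".join(parts)


source = generate_client(SPEC)
namespace = {}
exec(source, namespace)  # load the generated functions for a quick check
print(namespace["get_user"](42))
```

Reviewing this kind of output means reviewing the spec and the template once, rather than reading every generated function.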

rienbdj · 2025-06-04T06:24:57 1749018297

You can require less code by using a more expressive programming language.

azemetre · 2025-06-03T12:30:20 1748953820

Isn’t this because the LLMs had like a million+ react tutorials/articles/books/repos to train on?

I mean I try to use them for svelte or vue and it still recommends react snippets sometimes.

lenglain · 2025-06-04T13:04:04 1749042244

I have had no issues with LLMs trying to force a language on me. I tried the whole snake game test with ChatGPT, but instead of using Python I asked it to use the nodejs bindings for raylib, which is rather unusual.

It did it in no time and no complaints.

Culonavirus · 2025-06-03T13:57:16 1748959036

Generally speaking, "LLMs" that I use are always the latest thinking versions of the flagship models (Grok 3/Gemini 2.5/...). GPT4o (and equivalent) are a mess.

But you're correct, when you use more exotic and/or quite new libraries, the outputs can be of mixed quality. For my current stack (Typescript, Node, Express, React 19, React Router 7, Drizzle and Tailwind 4) both Grok 3 (the paid one with 100k+ context) and Gemini 2.5 are pretty damn good. But I use them for prototyping, i.e. quickly putting together new stuff, for types, refactorings... I would never trust their output verbatim. (YET.) "Build an app that ..." would be a nightmare, but React-like UI code at sufficiently granular level is pretty much the best case scenario for LLMs as your components should be relatively isolated from the rest of the app and not too big anyways.

trillic · 2025-06-03T17:45:12 1748972712

I put these in the Gemini Pro 2.5 system prompt and it's golden for Svelte.

https://svelte.dev/docs/llms

azemetre · 2025-06-03T18:09:30 1748974170

I do this and it still spits out react snippets regardless like 40% of the time... I feel like unless you are doing something extremely basic this is fine but once you introduce state or animations all these systems death spiral.

ambicapter · 2025-06-03T14:39:40 1748961580

Yes, definitely. Act accordingly.

lovich · 2025-06-03T16:56:07 1748969767

I use https://visualstudio.microsoft.com/services/intellicode/ for my IDE which learns on your codebase, so it does end up saving me a ton of time after it's learned my patterns and starts suggesting entire classes hooked up to the correct properties in my EF models.

It lets me still have my own style preferences with the benefit of AI code generation. Bridged the barrier I had with code coming from Claude/ChatGPT/etc where its style preferences were based on the wider internet's standards. This is probably a preference on the level of tabs vs spaces, but ¯\_(ツ)_/¯

zx8080 · 2025-06-03T12:09:28 1748952568

> An expert prompts it and checks the code. Still saves a lot of time vs typing everything from scratch.

It's a lie. The marketing one, to be specific, which makes it even worse.

victorbjorklund · 2025-06-03T16:56:31 1748969791

0points · 2025-06-03T10:30:13 1748946613

I really don't agree with the idea that expert time would just be spent typing, and I'd be really surprised if that's the common sentiment around here.

An expert reasons, plans ahead, thinks and reasons a little bit more before even thinking about writing code.

If you are measuring productivity by lines of code per hour then you don't understand what being a dev is.

brailsafe · 2025-06-03T11:20:13 1748949613

> I really don't agree with the idea that expert time would just be spent typing, and I'd be really surprised if that's the common sentiment around here.

They didn't suggest that at all, they merely suggested that the component of the expert's work that would otherwise be spent typing can be saved, while the rest of their utility comes from intense scrutiny, problem solving, decision making about what to build and why, and everything else that comes from experience and domain understanding.

fc417fc802 · 2025-06-03T14:13:21 1748960001

It's not just time spent typing. Figuring out what needs to be typed can be both draining and time consuming. It's often (but not always) much easier to review someone else's solution to the problem than it is to solve it from scratch on your own.

Oddly enough security critical flows are likely to be one of the few exceptions because catching subtle reasoning errors that won't trip any unit tests when reviewing code that you didn't write is extremely difficult.

oblio · 2025-06-03T15:15:13 1748963713

The problem is, building something IS the destination. At least the first 5-10 times. Building and fixing along the way is what builds lasting knowledge for most people.

kiitos · 2025-06-04T00:02:04 1748995324

Time spent typing is statistically 0% of overall time spent in developing/implementing/shipping a feature or product or whatever. There's literally no reason to try to optimize that irrelevant detail.

victorbjorklund · 2025-06-03T16:57:28 1748969848

Yea, and you still do that now. Let's say you spend 30% of your time coding and the rest planning. Well, now you got even more time for planning.

otabdeveloper4 · 2025-06-03T09:30:10 1748943010

> Still saves a lot of time vs typing everything from scratch

No it doesn't. Typing speed is never the bottleneck for an expert.

As an offline database of Google-tier knowledge, LLMs are useful. Though current LLM tech is half-baked, we need:

a) Cheap commodity hardware for running your own models locally. (And by "locally" I mean separate dedicated devices, not something that fights over your desktop's or laptop's resources.)

b) Standard bulletproof ways to fine-tune models on your own data. (Inference is already there mostly with things like llama.cpp, finetuning isn't.)

boruto · 2025-06-03T10:12:54 1748945574

I realize I procrastinate less when using LLM to write code which I know I could write.

kentonv · 2025-06-03T13:42:42 1748958162

I've noticed this too.

I remember hearing somewhere that humans have a limited capacity in terms of number of decisions made in a day, and it seems to fit here: If I'm writing the code myself, I have to make several decisions on every line of code, and that's mentally tiring, so I tend to stop and procrastinate frequently.

If an LLM is handling a lot of the details, then I'm just making higher-level decisions, allowing me to make more progress.

Of course this is totally speculation and theories like this tend to be wrong, but it is at least consistent with how I feel.

autoexec · 2025-06-03T17:01:30 1748970090

I have a feeling that it's something that might help today but also something you might pay for later. When you have to maintain or bug fix that same code down the line the fact that you were the one making all those higher-level decisions and thinking through the details gives you an advantage. Just having everything structured and named in ways that make the most sense to you seems like it'd be helpful the next time you have to deal with the code.

While it's often a luxury, I'd much rather work on code I wrote than code somebody else wrote.

victorbjorklund · 2025-06-03T17:00:45 1748970045

Maybe you type faster than me then :) I for sure type slower than Claude code. :)

brailsafe · 2025-06-03T11:26:45 1748950005

> No it doesn't. Typing speed is never the bottleneck for an expert

How could that possibly be true!? Seems like it'd be the same as suggesting being constrained to analog writing utensils wouldn't bottleneck the process of publishing a book or research paper. At the very least such a statement implies that people with ADHD can't be experts.

thisissomething · 2025-06-03T12:40:13 1748954413

Completely agree with you. I was working on the front-end of an application and I prompted Claude the following: "The endpoint /foo/bar is returning the json below ##json goes here##, show this as cards inside the component FooBaz following the existing design system".

In less than 5 minutes Claude created code that:

  * encapsulated the api call
  * modeled the api response using Typescript
  * created a re-usable and responsive ui component for the card (including a load state)
  * included it in the right part of the page

Even if I typed at 200wpm I couldn't produce that much code from such a simple prompt.

I also had similar experiences/gains refactoring back-end code.

This being said, there are cases in which writing the code yourself is faster than writing a detailed enough prompt, BUT those cases are becoming the exception with each new LLM iteration. I noticed that after the jump from Claude 3.7 to Claude 4 my prompts can be way less technical.

oblio · 2025-06-03T15:24:57 1748964297

The thing is... does your code end there? Would you put that code in production without a deep analysis of what Claude did?

s900mhz · 2025-06-04T03:33:25 1749008005

I’m not who you replied to but I keep functions small and testable paired with unit tests with a healthy mix of happy/sad path.

Afterwards I make sure the LLM passes all the tests before I spend my time to review the code.

I find this process keeps the iterations count low for review -> prompt -> review.

I personally love writing code with an LLM. I’m a sloppy typist but love programming. I find it’s a great burnout prevention.

For context: node.js development/React (a very LLM friendly stack.)
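The workflow described above (small functions, with happy-path and sad-path tests the generated code must pass before review) might look something like this. The thread's context is Node.js; this sketch uses Python, and `parse_port` is a made-up example function.

```python
def parse_port(value: str) -> int:
    """Parse a TCP port from text, rejecting anything outside 1..65535."""
    text = value.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a port number: {value!r}")
    port = int(text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def test_happy_path() -> None:
    assert parse_port("8080") == 8080
    assert parse_port(" 443 ") == 443


def test_sad_path() -> None:
    # Each bad input must raise ValueError rather than return a value.
    for bad in ["", "abc", "-1", "0", "65536", "80.5"]:
        try:
            parse_port(bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad!r}")


test_happy_path()
test_sad_path()
```

Writing the sad-path cases first gives the review step something concrete to check the generated code against.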

brailsafe · 2025-06-03T18:10:57 1748974257

(GP) I wouldn't, but it would get me close enough that I can do the work that's more intellectually stimulating. Sometimes you need the people to do the concrete for a driveway, and sometimes you need to be signing off on the way the concrete was done, perhaps making some tweaks during the early stages.

throwaway0123_5 · 2025-06-03T13:07:10 1748956030

It seems fair to say that it is ~never the overall bottleneck? Maybe once you figure out what you want, typing speed briefly becomes the bottleneck, but does any expert finish a day thinking "If only I could type twice as fast, I'd have gotten twice as much work done?" That said, I don't think "faster typing" is the only benefit that AI assistance provides.

otabdeveloper4 · 2025-06-03T12:37:37 1748954257

> How could that possibly be true!?

(I'll assume you're not joking, because your post is ridiculous enough to look like sarcasm.)

The answer is because programmers read code 10 times more (and think about code 100 times more) than they write it.

thisissomething · 2025-06-03T12:46:10 1748954770

Yeah, but how fast can you write compared to how fast you think?

How many times have you read a story card and by the time you finished reading it you thought "It's an easy task, should take me 1 hour of work to write the code and tests"?

In my experience, in most of those cases the AI can do the same amount of code writing in under 10 minutes, leaving me the other 50 minutes to review the code, make/ask for any necessary adjustments, and move on to another task.

dns_snek · 2025-06-03T12:55:28 1748955328

I don't know anyone who can think faster than they can type (on average), they would have to have an IQ over 150 or something. For mere mortals like myself, reasoning through edge cases and failure conditions and error handling and state invariants takes time. Time that I spend looking at a blinking cursor while the gears spin, or reading code. I've never finished a day where I thought to myself "gosh darn, if only I could type faster this would be done already".

skydhash · 2025-06-03T16:50:46 1748969446

You could be fast if you were coding only the happy path, like a lot of juniors do. Instead of thinking about trivial things like malformed input, library semantics, framework gotchas and what not.

brailsafe · 2025-06-03T18:16:22 1748974582

I wasn't joking, it's a bottleneck sometimes, that's it. It's a bottleneck like comfort and any good tool is a bottleneck, like a slow computer is a bottleneck. It's silly to suggest that your ability to rapidly use a fundamental tool is never a bottleneck, no matter what other bits need to come into play during the course of your day.

My ability to review and understand intent behind code isn't a primary bottleneck to me actually efficiently reviewing code when it's requested of me, the primary bottleneck is being notified at the right time that I have a waiting request to review code.

If compilers were never a bottleneck, why would we ever try to make them faster? If build tools were never a bottleneck, why would we ever optimize those? These are all just some of the things that can stand between the identification of a problem and producing a solution for it.

signa11 · 2025-06-03T11:59:44 1748951984

> ... Still saves a lot of time vs typing everything from scratch ...

how ? the prompts have still to be typed right ? and then the output examined in earnest.

fastball · 2025-06-03T12:25:35 1748953535

A prompt can be as little as a sentence to write hundreds of lines of code.

shaky-carrousel · 2025-06-03T16:26:06 1748967966

Hundreds of lines that you have to carefully read and understand.

fastball · 2025-06-04T00:55:33 1748998533

Are you not doing that already?

I go line-by-line through the code that I wrote (in my git client) before I stage+commit it.

victorbjorklund · 2025-06-03T17:00:10 1748970010

Depends on what it is doing. A html template without JS? Enough to just check if it looks right and works.

ImPostingOnHN · 2025-06-03T19:09:11 1748977751

You also have to do that with code you write without LLM assistance.

victorbjorklund · 2025-06-03T16:59:09 1748969949

Latest project I been working on. Prompts are a few sentences (and technically I dictate them instead of typing) and the LLM generates a few hundred lines of code.

fragmede · 2025-06-03T16:03:55 1748966635

not if you don't want to. speech to text is pretty good these days, and even eg aider has a /voice command thanks to OpenAI's whisper.

blinded · 2025-06-04T00:18:01 1748996281

Sure! But over half the fun of coding is writing and learning.

dismalaf · 2025-06-03T16:51:14 1748969474

> Still saves a lot of time vs typing everything from scratch

Probably very language specific. I use a lot of Ruby, typing things takes no time it's so terse. Instead I get to spend 95% of my time pondering my problems (or prompting the LLM)...

deepsun · 2025-06-03T16:58:29 1748969909

With a proper IDE you don't type much even in Java/.Net, it's all autocomplete anyway. "Too verbose" complaints are mostly from Notepad lovers, and those who never needed to read somebody else's code.

victorbjorklund · 2025-06-03T16:56:13 1748969773

It can create a whole dashboard view in elixir in a few seconds that is 100 lines long. No way I can type that in the same time.

QuadmasterXLII · 2025-06-03T17:46:13 1748972773

If you're making a dashboard view your productivity is zero, making it faster just multiplies zero by a bigger number.

Edit: this comment was more a result of me being in a terrible mood than a true claim. Sorry.

dismalaf · 2025-06-03T17:15:32 1748970932

In my experience the problem is never creating the dashboard view (there's a million examples of it out there anyway to copy/paste), but making sense of the data. Especially if you're doing anything even remotely novel.

827a · 2025-06-03T13:37:51 1748957871

I tend to disagree, but I don't know what my disagreement means for the future of being able to use AI when writing software. This workers-oauth-provider project is 1200 lines of code. An expert should be able to write that on the scale of an hour.

The main value I've gotten out of AI writing software comes from the two extremes; not from the middle-ground you present. Vibe coding can be great and seriously productive; but if I have to check it or manually maintain it in nearly any capacity more complicated than changing one string, productivity plummets. Conversely; delegating highly complex, isolated function writing to an AI can also be super productive, because it can (sometimes) showcase intelligence beyond mine and arrive at solutions which would take me 10x longer; but definitionally I am not the right person to check its code output; outside of maybe writing some unit tests for it (a third thing AI tends to be quite good at)

kentonv · 2025-06-03T20:12:20 1748981540

> This workers-oauth-provider project is 1200 lines of code. An expert should be able to write that on the scale of an hour.

Are you being serious here?

Let's do the math.

1200 lines in an hour would be one line every three seconds, with no breaks.

And your figure of 1200 lines is apparently omitting whitespace and comments. The actual code is 2626 lines. Let's say we ignore blank lines, then it's 2251 lines. So one line per ~1.6 seconds.

The best typists type like 2 words per second, so unless the average line of code has at most about 3 words on it, a human literally couldn't type that fast -- even if they knew exactly what to type.

Of course, people writing code don't just type non-stop. We spend most of our time thinking. Also time testing and debugging. (The test is 2195 lines BTW, not included in above figures.) Literal typing of code is a tiny fraction of a developer's time.

I'd say your estimate is wrong by at least one, but realistically more likely two orders of magnitude.
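The arithmetic in this comment can be checked in a few lines. The line counts and the 2 words per second figure are taken from the comment; the function and variable names are just for this sketch.

```python
SECONDS_PER_HOUR = 3600


def seconds_per_line(lines: int, hours: float = 1.0) -> float:
    # Time available per line if the whole file is typed in `hours`.
    return hours * SECONDS_PER_HOUR / lines


claimed = seconds_per_line(1200)   # the "1200 lines" figure
actual = seconds_per_line(2251)    # non-blank lines, per the comment

# At roughly 2 words per second, how many words fit in each line's budget?
words_per_line_budget = 2 * actual

print(f"{claimed:.1f} s/line at 1200 lines")
print(f"{actual:.1f} s/line at 2251 lines")
print(f"{words_per_line_budget:.1f} words/line at 2 words/s")
```

This covers typing only; it leaves out thinking, testing, and the separate test file the comment mentions.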

827a · 2025-06-03T22:37:50 1748990270

"On the scale of an hour" means "within an order of magnitude of one hour", or either "10 minutes to 10 hours" or "0.1 hours to 10 hours" depending on your interpretation, either is fine.

fc417fc802 · 2025-06-03T14:15:51 1748960151

> An expert should be able to write that on the scale of an hour.

An expert in oauth, perhaps. Not your typical expert dev who doesn't specialize in auth but rather in whatever he's using the auth for. Navigating those sorts of standards is extremely time consuming.

827a · 2025-06-03T19:19:37 1748978377

Maybe, but also: Cloudflare is one of like fifteen organizations on the planet writing code like this. The vast majority of The Rest Of Us will just consume code like this, which companies like Cloudflare, Auth0, etc write. That tends to be the nature of highly-specialized highly-domain-specific code. Cloudflare employs those mythical Oauth experts you talk about.

kentonv · 2025-06-03T20:33:24 1748982804

That's me. I'm the expert.

On my very most productive days of my entire career I've managed to produce ~1000 lines of code. This library is ~5000 (including comments, tests, and documentation, which you omitted for some reason). I managed to prompt it out of the AI over the course of about five days. But they were five days when I also had a lot of other things going on -- meetings, chats, code reviews, etc. Not my most productive.

So I estimate it would have taken me 2x-5x longer to write this library by hand.

i5heu · 2025-06-03T07:26:36 1748935596

Revealing against what?

If you look at the README it is completely revealed... so i would argue there is nothing to "reveal" in the first place.

> I started this project on a lark, fully expecting the AI to produce terrible code for me to laugh at. And then, uh... the code actually looked pretty good. Not perfect, but I just told the AI to fix things, and it did. I was shocked.

> To emphasize, this is not "vibe coded". Every line was thoroughly reviewed and cross-referenced with relevant RFCs, by security experts with previous experience with those RFCs.

JW_00000 · 2025-06-03T09:02:53 1748941373

I think OP meant "revealing" as in "enlightening", not as "uncovering something that was hidden intentionally".

rienbdj · 2025-06-03T08:52:09 1748940729

> Revealing against what?

Revealing of what it is like working with an LLM in this way.

kortilla · 2025-06-03T14:45:42 1748961942

Revealing the types of critical mistakes LLMs make. In particular someone that didn’t already understand OAuth likely would not have caught this and ended up with a vulnerable system.

risyachka · 2025-06-03T08:02:55 1748937775

If the guy knew how to properly implement oauth - did he save any time though by prompting or just tried to prove a point that if you actually already know all details of impl you can guide llm to do it?

That's the biggest issue I see. In most cases I don't use llm because DIYing it takes less time than prompting/waiting/checking every line.

JimDabell · 2025-06-03T08:24:44 1748939084

> did he save any time though

Yes:

> It took me a few days to build the library with AI.

> I estimate it would have taken a few weeks, maybe months to write by hand.

– https://news.ycombinator.com/item?id=44160208

> or just tried to prove a point that if you actually already know all details of impl you can guide llm to do it?

No:

> I was an AI skeptic. I thought LLMs were glorified Markov chain generators that didn't actually understand code and couldn't produce anything novel. I started this project on a lark, fully expecting the AI to produce terrible code for me to laugh at. And then, uh... the code actually looked pretty good. Not perfect, but I just told the AI to fix things, and it did. I was shocked.

— https://github.com/cloudflare/workers-oauth-provider/?tab=re...

autoexec · 2025-06-03T17:08:57 1748970537

> I thought LLMs were glorified Markov chain generators that didn't actually understand code and couldn't produce anything novel.

How novel is an OAuth provider library for Cloudflare Workers? I wouldn't be surprised if it'd been trained on multiple examples.

kentonv · 2025-06-03T17:15:09 1748970909

I'm not aware of any other OAuth provider libraries for Workers. Plenty of clients, but not providers -- implementing the provider side is not that common, historically. See my other comment:

https://news.ycombinator.com/item?id=44164204

theshrike79 · 2025-06-03T08:38:25 1748939905

Do people save time by learning to write code at 420WPM? By optimising their vi(m) layouts and using languages with lots of fancy operators that make things quicker to write?

Using an LLM to write code you already know how to write is just like using intellisense or any other smart autocomplete, but at a larger scale.

throwaway2037 · 2025-06-03T07:14:14 1748934854

While I think this is a cool (public) experiment by Cloudflare, asking an LLM to write security-sensitive code seems crazy at this point. Ad absurdum: Can you imagine asking Claude to implement new functionality in OpenSSL libs!?

PeterStuer · 2025-06-03T07:26:57 1748935617

Which is exactly why AI coding assistants work with your expertise rather than replace it. Most people I see fail at AI assisted development are either non-technical people expecting the AI will solve it all, or technical people playing gotcha with the machine rather than collaborating with it.

bootsmann · 2025-06-03T06:45:58 1748933158

There is also one quite early in the repo where the dev has to tell Claude to store only the hashes of secrets
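For readers unfamiliar with the pattern being referred to, storing only hashes of secrets can be sketched like this. It is an illustration, not the repository's code: `TokenStore` and its methods are hypothetical, and plain SHA-256 is assumed to be acceptable only because the secrets here are long random tokens (low-entropy passwords would call for a slow, salted password hash instead).

```python
import hashlib
import hmac
import secrets


def hash_secret(secret: str) -> str:
    return hashlib.sha256(secret.encode()).hexdigest()


class TokenStore:
    """Keeps only SHA-256 digests, so a leaked store does not leak tokens."""

    def __init__(self) -> None:
        self._records: dict[str, str] = {}  # token id -> hex digest

    def issue(self, token_id: str) -> str:
        # Generate a high-entropy secret, store its digest, and return the
        # plaintext once to the caller. It is never written to storage.
        secret = secrets.token_urlsafe(32)
        self._records[token_id] = hash_secret(secret)
        return secret

    def verify(self, token_id: str, presented: str) -> bool:
        stored = self._records.get(token_id)
        if stored is None:
            return False
        # Constant-time comparison of the two hex digests.
        return hmac.compare_digest(stored, hash_secret(presented))


store = TokenStore()
secret = store.issue("token-1")
print(store.verify("token-1", secret))   # True
print(store.verify("token-1", "guess"))  # False
```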

kentonv · 2025-06-03T13:34:46 1748957686

Yeah I was disappointed in that one.

I hate to say, though, but I have reviewed a lot of human code in my time, and I've definitely caught many humans making similar-magnitude mistakes. :/

hn_throwaway_99 · 2025-06-03T15:08:04 1748963284

I just wanted to say thanks so much for publishing this, and especially your comments here - I found them really helpful and insightful. I think it's interesting (though not unexpected) that many of the other commenters' comments here show what a Rorschach test this is. I think that's kind of unfortunate, because your experience clearly showed some of the benefits and limitations/pitfalls of coding like this in an objective manner.

I am curious, did you find the work of reviewing Claude's output more mentally tiring/draining than writing it yourself? Like some other folks mentioned, I generally find reviewing code more mentally tiring than writing it, but I get a lot of personal satisfaction by mentoring junior developers and collaborating with my (human) colleagues (most of them anyway...) Since I don't get that feeling when reviewing AI code, I find it more draining. I'm curious how you felt reviewing this code.

kentonv · 2025-06-03T15:56:56 1748966216

I find reviewing AI code less mentally tiring than reviewing human code.

This was a surprise to me! Until I tried it, I dreaded the idea.

I think it is because of the shorter feedback loop. I look at what the AI writes as it is writing it, and can ask for changes which it applies immediately. Reviewing human code typically has hours or days of round-trip time.

Also with the AI code I can just take over if it's not doing the right thing. Humans don't like it when I start pushing commits directly to their PR.

There's also the fact that the AI I'm prompting is, obviously, working on my priorities, whereas humans are often working on other priorities, but I can't just decline to review someone's code because it's not what I'm personally interested in at that moment.

When things go well, reviewing the AI's work is less draining than writing it myself, because it's basically doing the busy work while I'm still in control of high-level direction and architecture. I like that. But things don't always go well. Sometimes the AI goes in totally the wrong direction, and I have to prompt it too many times to do what I want, in which case it's not saving me time. But again, I can always just cancel the session and start doing it myself... humans don't like it when I tell them to drop a PR and let me do it.

Personally, I don't generally get excited about mentoring and collaborating. I wish I did, and I recognize it's an important part of my job which I have to do either way, but I just don't. I get excited primarily about ideas and architecture and not so much about people.

hn_throwaway_99 · 2025-06-03T18:29:12 1748975352

Thank you so much for your detailed, honest, and insightful response! I've done a bunch of AI-assisted coding to varying degrees of success, but your comment here helped me think about it in new ways so that I can take the most advantage of it.

Again, I think your posting of this is probably the best actual, real world evidence that shows both the pros and cons of AI-assisted coding, dispassionately. Awesome work!

jjcm · 2025-06-03T18:30:39 1748975439

Most interesting aspect of this is it likely learned this pattern from human-written code!

kentonv · 2025-06-04T15:24:10 1749050650

It's not a 100% bad idea. If you lose the encryption key, you lose the data. Data loss is bad! So better keep a backup of the key somewhere. I can see how it got there.

Defeats the purpose in this case though.

ActionHank · 2025-06-03T15:11:51 1748963511

But AIbros will be running around telling everyone that Claude invented OAuth for Cloudflare all on its own and then opensourced it.

bananapub · 2025-06-03T11:56:38 1748951798

this seems like a true but pointless observation? if you're producing security-sensitive code then experts need to be involved, whether that's me unwisely getting a junior to do something, or receiving a PR from my cat, or using an LLM.

removing expert humans from the loop is the deeply stupid thing the Tech Elite Who Want To Crush Their Own Workforces / former-NFT fanboys keep pushing, just letting an LLM generate code for a human to review then send out for more review is really pretty boring and already very effective for simple to medium-hard things.

toofy · 2025-06-04T00:09:49 1748995789

> …removing expert humans from the loop is the deeply stupid thing the Tech Elite Who Want To Crush Their Own Workforce…

this is completely expected behavior by them. departments with well paid experts will be one of the first they’ll want to cut. in every field. experts cost money.

we’re a long, long, long way off from a bot that can go into random houses and fix under the sink plumbing, or diagnose and then fix an electrical socket. however, those who do most of their work on a computer, they’re pretty close to a point where they can cut these departments.

in every industry in every field, those will be jobs cut first. move fast and break things.

hn_throwaway_99 · 2025-06-03T15:12:37 1748963557

I think it's a critically important observation.

I thought this experience was so helpful as it gave an objective, evidence-based sample on both the pros and cons of AI-assisted coding, where so many of the loudest voices on this topic are so one-sided ("AI is useless" or "developers will be obsolete in a year"). You say "removing expert humans from the loop is the deeply stupid thing the Tech Elite Who Want To Crush Their Own Workforces / former-NFT fanboys keep pushing", but the fact is many people with the power to push AI onto their workers are going to be more receptive to actual data and evidence than developers just complaining that AI is stupid.

october8140 · 2025-06-03T07:24:38 1748935478

It's a Jr Developer that you have to check all their code over. To some people that is useful. But you're still going to have to train Jr Developers so they can turn into Sr Developers.

PeterStuer · 2025-06-03T07:32:27 1748935947

I don't like the jr dev analogy. It has neither the same weaknesses nor the same strengths.

It's more like the genius coworker that has an overassertive ego and sometimes shows up drunk, but if you know how to work with them and see past their flaws, can be a real asset.

hn_throwaway_99 · 2025-06-03T15:16:01 1748963761

I also like your analogy, but it also explains why I find working with AI-assisted coding so mentally tiresome.

It's like with some auto-driving systems - I say it's like having a slightly inebriated teenager at the wheel. I can't just relax and read a book, because then I'd die. So I have to be more mentally alert than when just driving myself, because everything could be going smoothly and relaxed, but at any moment the driving system could decide to drive into a tree.

Cthulhu_ · 2025-06-03T12:25:44 1748953544

I don't really agree; a junior developer, if they're curious enough, wouldn't just write insecure code, they would do self-study and find out best practices etc before writing code, including not storing plaintext passwords and the like.

hn_throwaway_99 · 2025-06-03T15:16:40 1748963800

You have clearly only ever worked with the creme de la creme of junior developers.

paxys · 2025-06-02T15:04:53 1748876693

This is exactly the direction I expect AI-assisted coding to go in. Not software engineers being kicked out and some business person pressing a few buttons to have a fully functional app (as is playing out in a lot of fantasies on LinkedIn & X), but rather experienced engineers using AI to generate bits of code and then meticulously reviewing and testing them.

The million dollar (perhaps literally) question is – could @kentonv have written this library quicker by himself without any AI help?

kentonv · 2025-06-02T16:00:25 1748880025

It took me a few days to build the library with AI.

I estimate it would have taken a few weeks, maybe months to write by hand.

That said, this is a pretty ideal use case: implementing a well-known standard on a well-known platform with a clear API spec.

In my attempts to make changes to the Workers Runtime itself using AI, I've generally not felt like it saved much time. Though, people who don't know the codebase as well as I do have reported it helped them a lot.

I have found AI incredibly useful when I jump into other people's complex codebases, that I'm not familiar with. I now feel like I'm comfortable doing that, since AI can help me find my way around very quickly, whereas previously I generally shied away from jumping in and would instead try to get someone on the team to make whatever change I needed.

michelsedgh · 2025-06-03T06:21:02 1748931662

The fascinating part is that each person is finding their own way of using these tools, from kids to elders and everyone in between, no matter what your background or language or whatever is.

protocolture · 2025-06-03T10:00:56 1748944856

This. Lots of people talking up agents right now, but the conversational rubber duck thing hits the spot well for me.

srhtftw · 2025-06-02T20:56:30 1748897790

> It took me a few days to build the library with AI. ... > I estimate it would have taken a few weeks, maybe months to write by hand.

I don't think this is a fair assessment given the summary of the commit history https://pastebin.com/bG0j2ube shows your work started on 2025-02-27 and started trailing off at 2025-03-20 as others joined in. Minor changes continue to present.

> That said, this is a pretty ideal use case: implementing a well-known standard on a well-known platform with a clear API spec.

Still, this allowed you to complete in a month what may have taken two. That's a remarkable feat considering the time and value of someone of your caliber.

kentonv · 2025-06-03T02:27:55 1748917675

I think the data supports that there were about 5 distinct days when I did a large amount of work on this library, and a sprinkling of minor commits through the rest of the month. Glen's commits, while numerous, were also fairly minor, mostly logistical details around releases.

This library is not the only thing I was working on, nor even the main thing. As the lead engineer of Cloudflare Workers I have quite a few other things demanding my time.

motorest · 2025-06-03T05:07:36 1748927256

> (...) your work started on 2025-02-27 and started trailing off at 2025-03-20 as others joined in. Minor changes continue to present.

Your analysis is far too superficial to extract anything meaningful. I know for a fact that I have small projects that took me only a couple of days to get done which have a commit history ranging a few months. Also, software is never done. There's always room to refactor, and LLMs turn that into trivial problems. Lastly, is your project still under development if your commits are README updates, linter runs, and renaming variables?

There is a reason why commit history is not used to track productivity.

manquer · 2025-06-02T21:39:35 1748900375

Is it though?

Would someone of the author's caliber even be working on a trivial slog item like an OAuth2 implementation, if not for the novel development approach he wanted to attempt here?

For the kind of regular jobs an engineer is typically expected to do, would it give a 100% productivity jump?

srhtftw · 2025-06-02T23:55:26 1748908526

Many tools make lesser developers more productive (to a point) but they fail to improve the productivity of talented professionals. Lots of "no/low" code things come to mind. But here's a tool that made kentonv 2x productive at a task that's clearly in his wheelhouse. It seems under the right conditions it can improve the productivity of developers at the opposite end of the spectrum.

What other tools could do that?

andyferris · 2025-06-03T04:44:35 1748925875

To answer your question explicitly, we do have existing tools that help on that end, but they are nerdy and not hyped by beginners.

Type systems, LSPs, tests, formatters, Rust’s borrow checker, logs and traces, source control are examples of things that make experts go faster. This space is hardly neglected (but could always be better).

It is really nice to see LLMs helping on all skill levels.

9dev · 2025-06-02T19:50:24 1748893824

Funny thing. I have built something similar recently, that is a 2.1-compliant authorisation server in TypeScript[0]. I did it by hand, with some LLM help on the documentation. I think it took me about two weeks full time, give or take, and there’s still work to do, especially on the testing side of things, so I would agree with your estimate.

I’m going to take a very close look at your code base :)

[0] https://github.com/colibri-hq/colibri/blob/next/packages/oau...

upstairs-war · 2025-06-02T20:54:40 1748897680

Thanks kentonv. I picked up where you left off, supported with oauth2.1 rfc, and integrated ms oauth to our internal mcp server. Cool to have Claude be business aware

graeme · 2025-06-03T06:34:03 1748932443

>I have found AI incredibly useful when I jump into other people's complex codebases, that I'm not familiar with. I now feel like I'm comfortable doing that

This makes sense. Are there codebases where you find this doesn't work as well, either from the codebase's min required context size or the code patterns not being in the training data?

aprilthird2021 · 2025-06-03T05:41:02 1748929262

Matches my experiences well. Making changes to large, complex codebases I know well? Teaching the AI to get up to speed with me takes too much time.

Code I know nothing about? AI is very helpful there

philipwhiuk · 2025-06-02T16:06:26 1748880386

> Though, people who don't know the codebase as well as I do have reported it helped them a lot.

My problem I guess is that maybe this is just Dunning-Kruger-esque. When you don't know what you don't know you get the impression it's smart. When you do, you think it's rubbish.

Like when you see a media report on a subject you know about and you see it's inaccurate but then somehow still trust the media on a subject you're a non-expert on.

motorest · 2025-06-03T05:13:59 1748927639

> My problem I guess is that maybe this is just Dunning-Kruger-esque. When you don't know what you don't know you get the impression it's smart. When you do, you think it's rubbish.

I see your point. Indeed there are two completely different points of view regarding the output of LLMs:

* Hey, I managed to vibecode my way into a fully working web service with a React SPA after a couple of prompts, and a full automated test suite to boot.

* This project is nowhere as clean as I would have written it, and doesn't even follow my pet coding conventions.

One side lauds LLMs, the other complains they output mainly crap.

The truth of the matter is that the vast majority of software engineers write crap code, as the definition of "crap code" is "something I would have done differently". Opinionated engineers look at the output of LLMs and accuse it of being crap code. Eppur si muove.

phatskat · 2025-06-03T23:16:09 1748992569

> The truth of the matter is that the vast majority of software engineers write crap code, as the definition of "crap code" is "something I would have done differently".

This is certainly a part of it, but I do wonder: even if an LLM “learned” the conventions and preferences of an engineer and spit out “perfectly styled” code, would it be treated as such? I’d wager (a small amount) that it wouldn’t, because part of enjoying the code - for me - is _knowing_ the code. “I wrote it this way because I tried X, then Y, then saw I could do Z, and now I’m familiar with the code in a way that’s more intimate.” Unfamiliar code rarely looks _really good_, in my opinion.

throwaway314155 · 2025-06-02T18:35:42 1748889342

I think most of this just amounts to the same old good developers vs. bad developers situation that we've been in for decades.

giantrobot · 2025-06-02T21:13:25 1748898805

> Like when you see a media report on a subject you know about and you see it's inaccurate but then somehow still trust the media on a subject you're a non-expert on.

Gell-Mann Amnesia https://en.m.wikipedia.org/wiki/Gell-Mann_amnesia_effect

gokhan · 2025-06-02T15:23:43 1748877823

> Not software engineers being kicked out ... but rather experienced engineers using AI to generate bits of code and then meticulously reviewing and testing them.

But what if you only need 2 kentonv's instead of 20 at the end? Do you assume we'll find enough new tasks that will occupy the other 18? I think that's the question.

And the author is implementing a fairly technical project in this case. How about routine LoB app development?

thewebguyd · 2025-06-02T15:39:34 1748878774

> But what if you only need 2 kentonv's instead of 20 at the end? Do you assume we'll find enough new tasks that will occupy the other 18? I think that's the question.

This is likely where all this will end up. I have doubts that AI will replace all engineers, but I have no doubt in my mind that we'll certainly need a lot fewer engineers.

A not so dissimilar thing happened in the sysadmin world (my career) when everything transitioned from ClickOps to the cloud & Infrastructure as Code. Infrastructure that needed 10 sysadmins to manage now only needed 1 or 2 infrastructure folks.

The role still exists, but the quantity needed is drastically reduced. The work that I do now by myself would have needed an entire team before AWS/Ansible/Terraform, etc.

kentonv · 2025-06-02T16:47:21 1748882841

I think there's a huge huge space of software to build that isn't being touched today because it's not cost-effective to have an engineer build them.

But if the time it takes an engineer to build any one thing goes down, now there are a lot more things that are cost effective.

Consider niche use cases. Every company tends to have custom processes and workflows. Think about being an accountant at one company vs. another -- while a lot of the job is the same, there will always be parts that are significantly different. Those bespoke processes often involve manual labor because off-the-shelf accounting software cannot add custom features for every company.

But what if it could? What if an engineer working with AI could knock out customer-specific features 10x as fast as they could in the past. Now it actually makes sense to build those features, to improve the productivity of each company's accounting department.

It's hard to say if demand for engineers will go down or up. I'm not pretending to know for sure. But I can see a possibility that we actually have way more developers in coming years!

thewebguyd · 2025-06-02T19:31:44 1748892704

> I think there's a huge huge space of software to build that isn't being touched today because it's not cost-effective to have an engineer build them.

That's definitely an interesting area, but I think we'll actually see (maybe) individual employees solving some of these problems on their own without involving IT/the dev team.

We kind of see it already - a lot of these problem spaces are being solved with complex Excel workflows, crappy Access databases, etc. because the team needed their problem solved now, and resources couldn't be given to them.

Maybe AI is the answer to that so that instead of building a house of cards on Excel, these non-tech teams can have something a little more robust.

It's interesting you mentioned accounting, because that's the one department/area I see taking off and running with it the most. They are already the department that's effectively programming with Excel workflows & DSLs in whatever ERP du jour.

So it doesn't necessarily open up more dev jobs, but maybe fulfills the old mantra of "everyone will become a programmer," and we see more advanced computing become a commodity thanks to AI - much like everyone can click their way through an office suite with little experience or training, everyone will be able to use AI to automate large chunks of their job or departmental processes.

ktzar · 2025-06-03T05:13:50 1748927630

If we shiver at the sight of some of those accounting-created Excels, which we only learn about when they fail and they can't understand them anymore, wait for them to hand over a vibe-coded 200k loc Python codebase "which is not working anymore" and of which nobody has ever reviewed a single line.

kentonv · 2025-06-02T19:47:37 1748893657

> I think we'll actually see (maybe) individual employees solving some of these problems on their own without involving IT/the dev team.

I agree, but in my book, those employees are now developers. And so by that definition, there will be a lot more developers.

Will we see more or fewer people whose primary job is software development? That's harder to answer. I do think we'll see a lot more consultant-type roles, with experienced software developers helping other people write their own personal automations.

motorest · 2025-06-03T04:37:30 1748925450

> I think there's a huge huge space of software to build that isn't being touched today because it's not cost-effective to have an engineer build them.

LLMs don't change that. If a business does not have the budget for a software engineer, LLMs won't make up budget headroom for it either. What LLMs do is allow engineers to iterate faster, and work on more tasks. This means fewer jobs.

petersellers · 2025-06-03T05:18:24 1748927904

If a business has the budget for 1 or 2 engineers though, they might be able to task them with work that previously required 5-10 engineers (in theory, anyways).

motorest · 2025-06-03T05:51:06 1748929866

Right, but even the way you opted to frame this discussion is based on the idea that there is a drop in demand for software engineers. You need fewer engineers, not more. A few can get more done, but you need fewer to accomplish your tasks too.

simiones · 2025-06-03T10:43:12 1748947392

This is like claiming that there are fewer people who work in construction now than in the year 1000 because a machine can do what it would have literally taken 100 people to accomplish back then.

But what has happened instead is that we are now building many more buildings, and much more complex ones, than we ever would have even conceived of back then. The Three Gorges Dam required the work of thousands or even tens of thousands of people when it was built, and it would have required the work of millions in the year 1000. But it didn't actually generate millions of jobs in the year 1000: it was in fact never even conceived of as a possibility, much less attempted.

Of course, the opposite can also happen. The number of carpenters has reduced to almost nothing, when it used to be a major profession, and there are many other professions that have entirely disappeared.

petersellers · 2025-06-03T06:02:32 1748930552

I didn't frame it that way - perhaps you are thinking of the person you replied to?

Nevertheless, I don't think they are trying to frame it that way, either. The point is that making software development easier can actually increase the demand for software engineers in some cases (where projects that were previously not considered due to budget constraints are now feasible).

motorest · 2025-06-03T06:31:26 1748932286

> I didn't frame it that way - perhaps you are thinking of the person you replied to?

You did. You explicitly asserted the following.

> If a business has the budget for 1 or 2 engineers though, they might be able to task them with work that previously required 5-10 engineers (...).

In your own words, a project that would take 5-10 engineers is now feasible to be tackled with 1 or 2. Your own words.

> (...) The point is that making software development easier can actually increase the demand of software engineers in some cases (...)

I think that's somewhere between unrealistic and wishful thinking. Even in your problem statement, "making software development easier" lowers demand. Even if you argue that some positions might open where none existed before, the truth of the matter is that at the core of your scenario lies a drop in demand for software engineers. Shops who currently employ engineers won't need to retain as many to maintain their current level of productivity.

petersellers · 2025-06-03T07:16:33 1748934993

> In your own words, a project that would take 5-10 engineers is now feasible to be tackled with 1 or 2. Your own words.

That statement != lower demand for software engineers.

If a firm needs to perform project X that previously cost 10 engineers to do, but they only have the budget for 2, they will not tackle that project. Engineers used = 0.

However, if due to productivity enhancements with AI, the project can now be done with just 2 engineers, the company can now afford to tackle the project. Engineers used = 2.

That is the point that the person you were originally replying to was making.

> Even in your problem statement, "making software development easier" lowers demand.

Incorrect, as shown above.

> Even if you argue that some positions might open where none existed before, the truth of the matter is that at the core of your scenario lies a drop in demand for software engineers.

I see what you are trying to say, but it's not that clear cut. The fact is, no one knows what will actually happen to software engineering demand in the long run. Some scenarios will increase demand for engineers, others will decrease it. No one knows what the net demand will be, everyone is only guessing at this point.

basfo · 2025-06-03T11:11:30 1748949090

> If a firm needs to perform project X that previously cost 10 engineers to do, but they only have the budget for 2, they will not tackle that project. Engineers used = 0.

0 on that Project, but those 2 engineers will still be used on a different Project that needs just 2 Engineers.

BUT a company that sees that project as a critical part of the business and MUST tackle that project, will only need the 2 engineers on the payroll. Or hire just 2 instead of 10.

Engineers not hired = 8

Or.. maybe they don't really need that project that needs 10 engineers. They are ok as they are today, but they realize that with AI, they don't need those 2 engineers anymore to produce the same output, probably can be handled by just one with AI assistance.

Engineers fired = 1

ath92 · 2025-06-04T07:42:47 1749022967

But now every firm has access to AI. A firm that doesn’t fire people but instead simply boosts productivity will outcompete its competitors. The only way to compete with that firm is to also hire enough employees and give them AI tools.

hn_acc1 · 2025-06-02T19:06:26 1748891186

After 30+ years in the software field, and a user for 40+, having at times heavily customized my desktop or editor, for example - I've concluded that the best thing for most apps is for me to learn to use them with stock settings.

Why? Inevitably, I changed positions / jobs / platforms, and all that effort was lost / inapplicable, and I had to relearn to use the stock settings anyway.

Now, I understand that some companies have different setups, but it might just make more sense to change the company's accounting procedures (if possible) to conform to most accounting software defaults, rather than invest heavily in modifying the setup, unless you're a huge conglomerate and can keep people on staff. Why? Because someone, somewhere will have to maintain those changes. Sure, you can then hire someone else to update those changes - but guess what? Most likely, unless they open-source their changes, no LLM will have seen those changes, and even if they are allowed to fine-tune on it, they'll have seen exactly ONE instance of these changes. Odds they'll get everything right, AND the person using the LLM will recognize when it doesn't go right? Oh right, they invested in hundreds of unit tests to ensure everything works as expected even with changes, and I'm the tooth fairy..

blharr · 2025-06-03T06:45:15 1748933115

This just isn't true and will probably never be true. Using all the defaults is... probably optimal in the general sense and when things come to scale, but most companies (or just leadership) at some point want to leave the "standards" with custom design or additions. Also, any company providing payroll/accounting software has an inherent interest in going against standardization and providing features to promote lock-in.

kentonv · 2025-06-02T19:43:32 1748893412

There are good arguments to just conform. But it is in fact true nevertheless that many companies and teams continue to choose bespoke workflows over standardized ones. So I guess there must be something driving that.

I don't actually think this is going to take the form of LLMs implementing custom patches to off-the-shelf software. I think instead it's going to look like LLMs writing code that uses APIs offered by off-the-shelf software to script specific workflows.

aerhardt · 2025-06-03T19:15:23 1748978123

I work for SMEs as a consulting CTO, and this is exactly where I see things going in this domain. I can take care of workloads that would've been prohibitively expensive in the past. In the case of SMEs, this may cover critical problems whose resolution can unlock new levels of growth. LLMs can be an absolute boon for them, and I'm fairly optimistic about being able to capitalize on the opportunity.

int_19h · 2025-06-02T17:46:34 1748886394

It's interesting that you bring up accounting software as an example. In jurisdictions where legal requirements around it are a lot more specific than in e.g. US, accounting suites usually already come with a lot of customization hooks (up to and including full-fledged scripting DSLs), and there are software engineers and companies who specialize in using those to implement bespoke accounting requirements.

kentonv · 2025-06-02T17:51:49 1748886709

I admit I have no specific knowledge of accounting and just meant to reference any random department that isn't engineering.

(Though I think it's true of engineering too. We all have our own weird team-specific processes for code reviews and CI and deployments which could probably use better automation.)

But even where lots of customization exists today (such as in engineering!), more is always possible. It's always just a question of whether the automation saves as much time as it took to build. If the automations can be built faster, then it makes sense to build more of them.

intended · 2025-06-03T05:48:20 1748929700

Which solves the now problem for the tomorrow problem.

We assume quite a bit about the challenge when we say it’s getting features out.

It’s sort of like saying we can sprint faster with these tools, when the race is a marathon.

Or a better example is Coke vs Pepsi.

How do LLMs impact long-term project, firm, and process viability?

the_sleaze_ · 2025-06-02T21:55:58 1748901358

Banking allegedly runs on ancient cobalt cathedrals and mystical runes.

Will AI be able to translate all that into Rust?

mikeocool · 2025-06-02T16:18:43 1748881123

Though arguably cloud infra made it so that a lot more companies who never would have built out a data center or leased a chunk of space in one were spinning up some serious infra in AWS or Azure -- and thus hiring at least 1-2 devops engineers.

Before the end of zero interest rate policy, all the sysadmins I knew who made the transition to devops were never stuck looking for a job for long.

achierius · 2025-06-02T16:25:13 1748881513

To be clear, the number of people employed as "SREs" or "production engineers" is actually far, far higher (at least an order of magnitude) than in the days before cloud became a thing. There are simply far more apps / companies / businesses / etc. who use cloud hosting than there ever were doing on-prem work.

tkiolp4 · 2025-06-02T23:00:36 1748905236

I don’t think we would need fewer engineers… the work to be done will increase instead. Example: now it takes 10 engineers to release a product in 10 months without AI. With AI it takes let’s say 1 engineer to release the same product in 1 month. What’s the company gonna do now? Release 10 products in 10 months with 10 engineers (each using AI).

It’s an exaggeration I know, but you get the point.

motorest · 2025-06-03T04:39:33 1748925573

> What’s the company gonna do now? Release 10 products in 10 months with 10 engineers (each using AI).

Software is often not the bottleneck. If instead of 10 engineers you just need the one, the company will shed headcount it doesn't need. This might mean, for example, that instead of 10 developers and a software testing engineer, now a team changes to perhaps add testers while firing half of the developers.

intended · 2025-06-03T05:46:12 1748929572

I’m going to bet that it’s going to need far less AI.

There was another article posted somewhere that made a parallel between the AI hype and no-code, outsourcing and other waves that have come.

paxys · 2025-06-02T15:27:13 1748878033

Increased productivity means increased opportunity. There isn't going to be a time (at least not anytime soon) when we can all sit back and say "yup, we have accomplished everything there is to do with software and don't need more engineers".

spiderice · 2025-06-02T15:34:23 1748878463

But there very well might be a time very soon where humans no longer offer economic value to the software engineering process. If you could (and currently you can't) pay an AI $10k/year to do what a human could do in a year, why would you pay the human 6 figures? Or even $20k?

Nobody is claiming that humans won't have jobs simply because "we have accomplished everything there is to do". It's that humans will offer zero economic value compared to AI because AI gets so good and so cheap.

paxys · 2025-06-02T15:37:23 1748878643

And there might be a giant asteroid that strikes the earth a few years down the line ending human civilization.

If there is some magic $10k AI that can fully replace a $200k software engineer then I'd love to see it. Until that happens this entire discussion is science fiction.

alastairr · 2025-06-02T20:10:58 1748895058

You don’t need to completely replace a whole 200k engineer. You just need to increase each engineer’s productivity sufficiently that you can reduce the total number of engineers in your company.

motorest · 2025-06-03T06:09:05 1748930945

> If there is some magic $10k AI that can fully replace a $200k software engineer then I'd love to see it.

I think you have multiple offers of that very AI dangling in front of you, but you might be refusing to acknowledge them. One of the problems is the way you opt to frame the issue. Does "replacing" mean firing the guy hoping to replace him with a Slack webhook? Or does it mean your team decides they don't need the same headcount of medior/senior engineers because a team of junior engineers mentored by someone focusing on quality ends up being more productive?

spiderice · 2025-06-02T15:51:20 1748879480

If experts were saying the asteroid will hit earth in the next 5 years, would it still be science fiction?

You acting like those two scenarios are the same is disingenuous. Fuck that.

lukeschlather · 2025-06-02T17:28:48 1748885328

Experts understand orbital mechanics pretty well. If experts say an asteroid will hit in the next 5 years, it's pretty similar to saying that a rock dropped from the top of a skyscraper will hit the ground. It happens billions of times every day, we know the cause and effect.

With AI, there's no real expertise involved in saying "well, it was very stupid 5 years ago, now it's starting to seem smart, if we extrapolate it's going to be smarter than me in 5 years." But no one really knows what level of effort is required to make it smarter than me. No one is an expert in something that doesn't exist yet.

paxys · 2025-06-02T15:55:09 1748879709

Remove all the "experts" who have a major conflict of interest (running AI startups, selling AI courses, wanting to pump their company's stock price by associating with AI) and you'll find that very few actual experts in the field hold this view.

TeMPOraL · 2025-06-02T16:00:13 1748880013

Yup, because it's a stupid view. Good enough AI is right here, right now, today; it's already impacting day-to-day work in the software industry. That one is blindingly obvious to anyone who actually bothers to look around. You don't need experts to tell you the water is wet. It takes something special to try and deny this.

It may not manifest as job loss yet, but the market response to changes is a whole other thing. For one, it's likely to first manifest as slowing down hiring relative to amount of projects being started and then released. Software is a growing market after all.

motorest · 2025-06-03T06:17:14 1748931434

> Remove all the "experts" who have a major conflict of interest (...) and you'll find that very few actual experts in the field hold this view.

You might seek comfort in your conspiracy theories, but back in the real world the likes of me were already quite capable of creating complete and fully working projects from scratch using yesterday's LLMs.

We are talking about afternoons where you grab your coffee, saying to yourself "let's see what this vibecode thing is all about", and challenging yourself to create projects from scratch using nothing but a definition of done, LLM prompts, and a free-tier LLM configured to run in agent mode.

What, then?

You then can proceed to nitpick about code quality and bugs, but I can also say the same thing about your work, which you take far longer to deliver.

TeMPOraL · 2025-06-02T15:56:59 1748879819

It's not. Consider that replacing the only $200k software engineer on the project is different than replacing the third or tenth $200k software engineer on the project. To the extent AI is improving productivity of those engineers, it reduces the need for adding more engineers to that team. That may mean firing some of them, or just not hiring new ones (or fewer of them) as the project expands, as existing ones + AI can keep up with increased workload.

nand_gate · 2025-06-02T20:54:34 1748897674

I'm biased but my money's on the end result of AI being fewer engineers per team but also teams as a concept becoming obsolete.

Why keep legacy structures, with luxuries like POs or PMs if AI becomes powerful as you say - it'll just be 'one man startups' for better or worse.

Any empire-building VP should probably fear the wishful AI future they're praying for!

TeMPOraL · 2025-06-04T07:52:40 1749023560

> it'll just be 'one man startups' for better or worse.

Not necessarily. The reality is, whatever some people can do individually, if they team up, they can do more together. The teams and small startups will remain for now, and so will big companies.

I do imagine however that the internal structure will change. As the AI gets better and able to do more independently, people will shift from pair programming to more of a PM role (this is happening now), and this I imagine will quickly collapse further.

Even today, LLMs seem more suited for project management than doing actual coding - it's just the space in-between that's the problem. I.e. LLMs can code great in the small, and can break down work very well, but keeping the changes consistent and following the plan is where they still struggle. As that gap closes, I'm not really sure what the team composition would look like. But I don't doubt there'd still be teams.

nand_gate · 2025-06-02T20:43:39 1748897019

In this scenario, who would be buying this product that offers 'zero economic value compared to AI because AI gets so good and so cheap'?

hooverd · 2025-06-02T15:54:14 1748879654

You run into knowledge collapse because nobody is socially reproducing that knowledge.

amanaplanacanal · 2025-06-02T17:46:57 1748886417

This seems an important thing that somebody should be concerned about. How do we get the next generation of engineers? And how will they even be able to do the senior engineer work of validating the LLM output if they haven't had the years of experience writing code themselves?

tonyhart7 · 2025-06-03T05:33:07 1748928787

well they just need an information archive to learn that knowledge online, no human needed

in software at least. but if you're involved in hardware, the good thing is AI can't just replace you outright

lanthissa · 2025-06-02T20:16:13 1748895373

it doesn't even have to be that. software engineer used to be a medium pay job, there's no law of the universe that says it can't go back to that.

simonw · 2025-06-02T21:06:06 1748898366

I guess I have trouble empathizing with "But what if you only need 2 kentonv's instead of 20 at the end?" because I'm an open source oriented developer.

What's open source for if not allowing 2 developers to achieve projects that previously would have taken 20?

dkdcio · 2025-06-02T15:08:51 1748876931

> The million dollar (perhaps literally) question is – could @kentonv have written this library quicker by himself without any AI help?

I *think* the answer to this is clearly no: or at least, given what we can accomplish today with the tools we have now, and that we are still collectively learning how to effectively use this, there's no way it won't be faster (with effective use) in another 3-6 months to fully-code new solutions with AI. I think it requires a lot of work: well-documented, well-structured codebases with fast built-in feedback loops (good linting/unit tests etc.), but we're heading there no

motorest · 2025-06-03T05:47:26 1748929646

> I think the answer to this is clearly no: or at least, given what we can accomplish today with the tools we have now, and that we are still collectively learning how to effectively use this, there's no way it won't be faster (with effective use) in another 3-6 months to fully-code new solutions with AI.

I think these discussions need to start from another point. The techniques changed radically, and so did the way problems are tackled. It's not that a software engineer is/was unable to deliver a project with/without LLMs. That's a red herring. The key aspects are things like the overall quality of the work being delivered vs how much time it took to reach that level of quality.

For example, one of the primary ways an LLM is used is not to write code at all: it's to explain to you what you are looking at. Whether it's used as a Google substitute or a rubber duck, developers are able to reason about existing projects and even explore approaches and strategies to tackle problems in ways they never could before. You no longer need to book meetings with a principal engineer to ask questions: you just drop a line in Copilot Chat and ask away.

Another critical aspect is that LLMs help you explore options faster, and iterate over them. This allows you to figure out what approach works best for your scenario and adapt to emerging requirements without having to even chat with anyone. This means that, within the timeframe you would deliver the first iteration of a MVP, you can very easily deliver a much more stable project.

james_marks · 2025-06-03T14:08:57 1748959737

Exactly this

> Another critical aspect is that LLMs help you explore options faster, and iterate over them. This allows you to figure out what approach works best for your scenario and adapt to emerging requirements without having to even chat with anyone. This means that, within the timeframe you would deliver the first iteration of a MVP, you can very easily deliver a much more stable project

colonCapitalDee · 2025-06-03T07:48:53 1748936933

I've had great success with downloading source code and docs and using Claude Code to query them

necovek · 2025-06-03T04:09:24 1748923764

In a "well-documented, well-structured codebase with fast built-in feedback loops", a human programmer is really empowered to make changes fast. This is exactly what's needed for fast iteration, including in unfamiliar codebases.

When you are not introducing a new pattern in the code structure, it's mostly copy-paste and then edit.

But it's also extremely rare, so a pretty high bar to be able to benefit from tools like AI.

bigstrat2003 · 2025-06-02T15:21:24 1748877684

> but rather experienced engineers using AI to generate bits of code and then meticulously testing and reviewing them.

My problem is that (in my experience anyways) this is slower than me just writing the code myself. That's why AI is not a useful tool right now. They only get it right sometimes so it winds up being easier to just do it yourself in the first place. As the saying goes: bad help is worse than no help at all, and AI is bad help right now.

motorest · 2025-06-03T04:33:12 1748925192

> My problem is that (in my experience anyways) this is slower than me just writing the code myself.

In my experience, the only time LLMs slow down your task is when you don't use them effectively. For example, if you provide barely any context or feedback and you prompt an LLM to write you the world, of course it will output unusable results, primarily because it will be forced to interpolate and extrapolate through the missing context.

If you take the time to learn how to gently prompt a LLM into doing what you need, you'll find out it makes you far more productive.

JimDabell · 2025-06-02T15:45:34 1748879134

> My problem is that (in my experience anyways) this is slower than me just writing the code myself.

How much experience do you have writing code vs how much experience do you have prompting using AI though? You have to factor in that these tools are new and everybody is still figuring out how to use them effectively.

imiric · 2025-06-03T04:46:12 1748925972

> You have to factor in that these tools are new and everybody is still figuring out how to use them effectively.

I think that the skills required are highly overblown.

The user should be aware of what each model excels at, its context size, temperature, and other parameters; how to communicate well, set system prompts and phrase tasks in a clear, succinct yet informative way; how to refocus the session when it veers off track; keep up to date with the latest (<~6mo) concepts and tooling, and so on.

All of this is trivial for a competent software engineer. The idea that it requires some specialized training that couldn't be attained by experimentation and reading a blog post is absurd. "Prompt engineering" just isn't a thing.

uludag · 2025-06-02T15:41:00 1748878860

I feel this is on point. So not only is there the time lost correcting and testing AI generated code, but there's also the mental model you build of the code when you write it yourself.

Assuming you want a strong mental model of what the code does and how it works (which you'd use in conversations with stakeholders and architecture discussions for example), writing the code manually, with perhaps minor completion-like AI assistance, may be the optimal approach.

0xbadcafebee · 2025-06-03T12:50:58 1748955058

That's not the million dollar question; anyone who's done any kind of AI coding will tell you it's ridiculously faster. I haven't touched JavaScript, CSS & HTML in like a decade. But I got a whole website created with complex UI interactions in 20 minutes - and no frameworks - by just asking ChatGPT to write stuff for me. And that's the crappy, inefficient way of doing this work. Would have taken me a week to figure out all that. If I'd known how to do it already, and I was very good, perhaps it would have taken the same amount of time? But clearly there is a force-multiplier at work here.

The million dollar question is, what are the unintended, unpredicted consequences of developing this way?

If AI allows me to write code 10x faster, I might end up with 10x more code. Has our ability to review it gotten equally fast? Will the number of bugs multiply? Will there be new classes of bugs? Will we now hire 1 person where we hired 5 before? If that happens, will the 1 person leaving the company become a disaster? How will hiring work (cuz we have such a stellar track record at that...)? Will the changing economics of creating software now make SaaS no longer viable? Or will it make traditional commercial software companies no longer viable? Will the entire global economy change, the way it did with the rise of the first tech industry? Are we seeing a rebirth?

We won't know for sure what the consequences are for a while. But there will be consequences.

motorest · 2025-06-03T04:27:05 1748924825

> This is exactly the direction I expect AI-assisted coding to go in. Not software engineers being kicked out and some business person pressing a few buttons to have a fully functional app (as is playing out in a lot of fantasies on LinkedIn & X), but rather experienced engineers using AI to generate bits of code and then meticulously reviewing and testing them.

There is a middle ground: software engineers being kicked out because now some business person can hand over the task of building the entire OAuth infrastructure to a single inexperienced developer with a Claude account.

petersellers · 2025-06-03T05:14:26 1748927666

I'm not so sure that would work well in practice. How would the inexperienced developer know that the code created by the AI was correct? What if subtle bugs are introduced that the inexperienced developer didn't catch until it went out into production? What if the developer didn't even know how to debug those problems correctly? Would they know that the code they are writing is maintainable and extensible, or are they just going to generate a new layer of code on top of the old one any time they need a new feature?

motorest · 2025-06-03T05:54:31 1748930071

> I'm not so sure that would work well in practice. How would the inexperienced developer know that the code created by the AI was correct?

Not a problem. The industry has evolved to tolerate buggy code that barely works. In fact, in some circles that's what's already expected from the baseline. LLMs change nothing in this regard. In fact, they arguably improve upon this problem as it becomes trivial to implement extensive automated test suites.

> What if subtle bugs are introduced that the inexperienced developer didn't catch until it went out into production?

That's what is happening in the real world without LLMs entering the picture.

petersellers · 2025-06-03T06:06:11 1748930771

I disagree strongly with this conclusion.

I've seen firsthand what happens to large software projects that collapse under their own weight of tech debt. The software literally could not function as intended - customers were lost, the product went under. Low quality being "expected" (which isn't true in my experience, either) is irrelevant when the software doesn't work at all.

The chances of all of that happening are a lot higher with a lone inexperienced engineer at the wheel. You still need experienced engineers to maintain your software, period.

> That's what is happening in the real world without LLMs entering the picture.

The difference is that most firms have experienced software engineers to fix those defects.

sensanaty · 2025-06-03T10:13:56 1748945636

> Low quality being "expected" (which isn't true in my experience, either) is irrelevant when the software doesn't work at all.

Yep, fully agree. We're going through this ourselves at $CURRENT_JOB, where the instability of the platform and product as a whole due to the immensely bad decisions made in the project's past is leading to massive churn from every single customer other than the smallest ones that make us no money anyway.

And it's not just the customers, the devs are feeling it too. There's constant fires and breakages all over the place because management doesn't care to give us any time to focus on quality, and people (myself included) are getting tired of having to read through some 10kLOC monstrosity that not even God Himself could understand, and it's made worse by the clueless management saying "Have you tried having AI find the bugs for you?" like a bunch of brainless sheep being injected with that sweet ol' VC hype machine.

Sure, people will put up with some bugs from time to time, and I'm not even saying I could've or do make perfect choices as well. But there's only so many times people will put up with a broken experience before they cut ties and quit, and in this vibe-coded hallucination world we're entering, are people really going to be okay with the products they use day-in, day-out changing behavior drastically every single day based on whatever the AI decided to hallucinate this time around to "fix" that 1 persistent bug that can't seem to die?

intended · 2025-06-03T05:36:58 1748929018

By then the person who suggested the idea has left the firm.

stackskipton · 2025-06-02T15:51:25 1748879485

>experienced engineers using AI to generate bits of code and then meticulously reviewing and testing them

And where are we supposed to get experienced engineers if we replace all Jr Devs with AI? There is a ton of benefit from the drudgery of writing classes even if it seems like grunt work at the time.

belter · 2025-06-02T15:22:50 1748877770

The million-dollar question is not whether you can review at the speed the model is coding. It is whether you can trust review alone to catch everything.

If a robot assembles cars at lightning speed... but occasionally misaligns a bolt, and your only safeguard is a visual inspection afterward, some defects will roll off the assembly line. Human coders prevent many bugs by thinking during assembly.

pton_xd · 2025-06-02T16:41:35 1748882495

> Human coders prevent many bugs by thinking during assembly.

I'm far from an AI true believer but come on -- human coders write bugs, tons and tons of bugs. According to Peopleware, software has "an average defect density of one to three defects per hundred lines of code"!

belter · 2025-06-03T13:08:14 1748956094

My point is that the bugs generated by LLM or human coders are different.

chrisweekly · 2025-06-02T15:45:27 1748879127

THIS.

IMHO more rigorous test automation (including fuzzing and related techniques) is needed. Actually that holds whether AI is involved or not, but probably more so if it is.
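
To make "fuzzing and related techniques" concrete, here is a minimal sketch of a randomized round-trip test using only the Python standard library. The `encode_token` / `decode_token` pair is a hypothetical stand-in for whatever code is under review (it is not taken from the OAuth library being discussed), and a real setup would more likely use a dedicated fuzzer or property-testing tool.

```python
import base64
import random


def encode_token(raw: bytes) -> str:
    """Hypothetical code under test: URL-safe base64 without padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_token(text: str) -> bytes:
    """Inverse of encode_token: restore the padding, then decode."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def fuzz_round_trip(iterations: int = 1000, seed: int = 0) -> int:
    """Feed random byte strings through encode/decode and check the
    round-trip property. Returns the number of inputs checked."""
    rng = random.Random(seed)  # fixed seed so failures are reproducible
    for _ in range(iterations):
        length = rng.randrange(0, 64)
        raw = bytes(rng.randrange(256) for _ in range(length))
        assert decode_token(encode_token(raw)) == raw, raw
    return iterations


if __name__ == "__main__":
    print("inputs checked:", fuzz_round_trip())
```

The point of the property (decode(encode(x)) == x) is that it does not care who wrote the code, so it catches the same class of mistake whether a human or an LLM introduced it.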

Shorn · 2025-06-03T00:59:40 1748912380

And yet, doors still fall off airplanes without any AI in sight.

jstummbillig · 2025-06-03T14:58:02 1748962682

This is not where AI-assisted coding is going. Where it is going is: The AI will quickly become better at avoiding these types of mistakes than humans ever were (and are ever going to be), because they can and thus will be RL'ed away. What will be left standing longest is providing the vision wrt what the actual problem you want to solve is.

danans · 2025-06-02T15:39:38 1748878778

> Not software engineers being kicked out and some business person pressing a few buttons to have a fully functional app (as is playing out in a lot of fantasies on LinkedIn & X)

The theory of enshittification says that the "business person pressing a few buttons" approach will be pursued, even if it lowers quality, to save costs, at least until that approach undermines quality so much that it undermines the business model. However, nobody knows how much quality tradeoff tolerance there is to mine.

tkiolp4 · 2025-06-02T22:50:33 1748904633

Why is speed important in this context? If the code is published one week/month later, would that affect what exactly? It’s open source.

kentonv · 2025-06-02T23:08:12 1748905692

As it happens, if this were released a month later, it would have been a huge loss for us.

This OAuth library is a core component of the Workers Remote MCP framework, which we managed to ship the day before the Remote MCP standard dropped.

And because we were there and ready for customers right at the beginning, a whole lot of people ended up building their MCP servers on us, including some big names:

https://blog.cloudflare.com/mcp-demo-day/

(Also if I had spent a month on this instead of a few days, that would be a month I wasn't spending on other things, and I have kind of a lot to do...)

kypro · 2025-06-03T08:43:48 1748940228

> Not software engineers being kicked out and some business person pressing a few buttons to have a fully functional app (as is playing out in a lot of fantasies on LinkedIn & X), but rather experienced engineers using AI to generate bits of code and then meticulously reviewing and testing them.

Why would a human review the code in a few years when AI is far better than the average senior developer? Wouldn't that be as stupid as a human reviewing stockfish's moves in Chess?

hooverd · 2025-06-02T15:52:27 1748879547

AI is great for undifferentiated heavy lifting and surfacing knowledge, but by the time I've made all the decisions, I can just write the code that matters myself there.

c-linkage · 2025-06-02T15:38:05 1748878685

I very much appreciate the fact that the OP posted not just the code developed by AI but also posted the prompts.

I have tried to develop some code (typically non-web-based code) with LLMs but never seem to get very far before the hallucinations kick in and drive me mad. Given how many other people claim to have success, I figure maybe I'm just not writing the prompts correctly.

Getting a chance to see the prompts shows I'm not actually that far off.

Perhaps the LLMs don't work great for me because the problems I'm working on are somewhat obscure (currently reverse engineering SAP ABAP code to make a .NET implementation on data hosted in Snowflake) and often quite novel (I'm sure there is an OpenAuth implementation on GitHub somewhere from which the LLM can crib).

8-prime · 2025-06-03T07:44:27 1748936667

This is something that I have noticed as well. As soon as you venture into somewhat obscure fields, the output quality of LLMs drastically drops in my experience.

Side note, reverse engineering SAP ABAP sounds torturous.

Vicinity9635 · 2025-06-03T23:16:52 1748992612

The worst part isn't even that the quality drops off, it's that the quality drops off but the tone of the responses doesn't. So hallucinations can start and it's just confidently wrong or even dangerous code and the only way to know better is to be better than the LLM in the first place.

They might surpass us someday, but we aren't there yet.

theshrike79 · 2025-06-03T11:50:09 1748951409

The usual solution is a multi-tiered one.

First you use any LLM with a large context to write down the plan - preferably in a markdown file with checkboxes "- [ ] Task 1"

Then you can iterate on the plan and ask another LLM more focused on the subject matter to do the tasks one by one, which allows it to work without too much hallucination as the context is more focused.

mtlynch · 2025-06-02T15:07:09 1748876829

>In all seriousness, two months ago (January 2025), I (@kentonv) would have agreed.

I'm confused by what "I (@kentonv)" means here because kentonv is a different user.[0] Are you saying this is your alt? Or is this a typo/misunderstanding?

Edit: Figured out that most of your post is quoting the README. Consider using > and * characters to clarify.

[0] https://news.ycombinator.com/user?id=kentonv

kentonv · 2025-06-02T15:09:01 1748876941

He is quoting from the project readme. I wrote all this text.

mdaniel · 2025-06-02T15:11:21 1748877081

Thanks for weighing in here

If I might make a suggestion, based on how fast things change, even within a model family, you may benefit from saying which Claude. I was especially cognizant of this given the recent v4 release which was (of course) hailed as the second coming. Regardless, you may want to update your readme to say

It may also be wildly out of scope for including in a project's readme, but knowing which of the bazillions of coding tools you used would also help a tiny bit with the reproduction crisis found in every single one of these style threads

kentonv · 2025-06-02T15:21:41 1748877701

I believe it's important to say when AI was used so heavily in building a library -- it would feel dishonest to me to claim I wrote it all myself. I also think it's just a pretty interesting thing to know about. So I think it belongs in the readme. (But I'm not making a moral judgment on what anyone else does.)

It was almost entirely Claude Sonnet 3.7. I agree I should add the version to the readme.

pera · 2025-06-02T17:48:45 1748886525

That's interesting. My experience with Sonnet 3.7 early this year was pretty poor: It simply couldn't reach the correct solution alone, even when explaining the issues explicitly. The proposed invalid solution was not too far from the correct one, so you could fix it manually if you knew what you were doing, but then the way the code was structured was not something that I would like to maintain in a real project. All this on top of the usual UX issues like hallucinated APIs. The experience refactoring was even worse.

I guess your mileage is highly dependent on the domain of your problem? In my case it was GIS, by the way

diggan · 2025-06-02T15:13:49 1748877229

> It may also be wildly out of scope for including in a project's readme

The entire point of the repository seems to be to validate/invalidate the thesis that LLMs are good enough to be pair programmers right now. Removing it from the README makes no sense in that context.

kentonv · 2025-06-02T15:22:51 1748877771

This library is a core component of our MCP framework, it's not just an experiment.

mdaniel · 2025-06-02T15:18:30 1748877510

I did consider that, but the repo isn't called "kentonv does a yolo" it's straight-up labeled as a provider library for CF workers under Cloudflare's brand

Some hair splitting about whether including the Claude stanza is "full disclosure," or "AI advocacy," or just because it's cool

Anyway, I mentioned the out of scope because if half the readme is about correct usage of the library, and half is about the sausage making, I'd be confused as a reader about whether this was designed to be for real or for funzies

uludag · 2025-06-02T15:30:17 1748878217

I found it pretty strange to include in the readme as well. Like, imagine someone relied on fiverr or codementor.io to write this code. It'd be weird to say in the readme "I was fairly skeptical that I could get quality code written on Fiverr, but I tried it and it turns out it was pretty good!"

My guess is there was some push to do anything related to AI at the company. I feel a lot of companies are doing this these days.