I've experimented with coding with GPT a few times now.
GPT is legitimately interesting as an alternative search interface to StackOverflow. I've found that 15 to 45 minutes of searching with Google/StackOverflow can be reduced to just 5 minutes with GPT.
But beyond that, it's been very disappointing. Whenever I've tried getting it to write something even slightly non-trivial (i.e. tougher than just copy/pasting from online documentation or a StackOverflow answer), it has produced code that is badly broken, with flaws subtle enough that they might not jump out at a novice programmer right away. It has consistently struggled with programming problems that I would rate at 4/10 or 5/10 difficulty.
Most of the code I write is fairly trivial, but it's glue code that is highly specific to my particular code base, so GPT isn't helpful: it doesn't know about my codebase, and if you try to copy/paste large chunks of your codebase into the prompt, it runs into context-length limits and starts forgetting earlier parts.
And GPT isn't helpful for the non-trivial parts of my code either (as mentioned above). So what's left?
So when I see people say that it 10x'ed their productivity, I wonder if they exclusively write very trivial code that is effectively copy/paste from Stack Overflow or if they've allowed GPT to fill their codebases with flawed code without realizing it.
Maybe future iterations of GPT will get it. GPT-4 is definitely not there yet.
Have you tried prompting it with a detailed explanation of requirements?
I usually prompt it with short questions, but I recently saw a video where the person provided lengthy prompts (50–150 words) detailing the requirements for what they were requesting. I was shocked at the results. (It still required iterations of corrections/modifications, though.)
I haven’t tried it myself yet, but it’s an avenue that might yield better results than what you’ve experienced — perhaps even vastly better results.
It doesn’t need to know about your specific code base if you have it write a self-contained function for the task you’re prompting it for. You can copy/paste the function into your code and call it with arguments that are specific to your code base.
You can even prompt it to write a unit test for the function it wrote.
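For instance (a made-up sketch in Python, not anything from this thread — the function and test are purely hypothetical):

    # Hypothetical example: a self-contained function you might ask GPT to
    # write, plus a unit test for it (runnable with pytest).
    def chunk_list(items, size):
        """Split a list into consecutive chunks of at most `size` items."""
        if size < 1:
            raise ValueError("size must be >= 1")
        return [items[i:i + size] for i in range(0, len(items), size)]

    def test_chunk_list():
        assert chunk_list([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]
        assert chunk_list([], 3) == []
        assert chunk_list(["a", "b"], 5) == [["a", "b"]]

Because the function takes everything it needs as arguments, GPT never has to see the surrounding codebase.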
As for your concerns about code errors, I’ve found that you have to treat the coding support as an iterative process: request code, then ask it to improve or correct the result it just gave. You can even prompt it to check its previous result for errors.
It’s odd how we have such different experiences. As an example, yesterday I had GPT-4 take an existing Python script, clean it up, and add timing statistics and progress bars. 10 minutes’ work; it would have taken me probably an hour.
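To give a flavor of that kind of change, here’s a minimal sketch (not the actual script; it assumes a simple loop over work items, uses time.perf_counter for the timing statistics, and the third-party tqdm package for the progress bar):

    import time
    from tqdm import tqdm

    def process(item):
        time.sleep(0.01)  # stand-in for the real per-item work

    items = list(range(500))
    start = time.perf_counter()
    for item in tqdm(items, desc="Processing"):  # progress bar over the loop
        process(item)
    elapsed = time.perf_counter() - start
    print(f"Processed {len(items)} items in {elapsed:.2f}s "
          f"({elapsed / len(items) * 1000:.1f} ms/item)")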
I do think you have a point about it being most useful for trivial or boilerplate problems, but it’s still very useful, as there is always plenty of that.
It's like they said: it depends a lot on what kind of work you're doing and whether you are working with public frameworks or internal APIs. If you're just banging out code that is very similar to what other people have written, using very well-known APIs, then it's fantastic. If you need to debug a large, complex code base, it doesn't currently have enough capacity to understand and retain all the information required, and there are plenty of other things you'd need that it can't do either.
I'm currently writing a demo that I'll present next week using Jetpack Compose, a UI toolkit I'm not really familiar with, so it's been really helpful for that. In fact, I have a tool that works almost like a build system: it compiles English-language spec files down to code, and then lets me keep working with the AI by editing the spec and the code together. That's been tremendously effective, especially with GPT-4.
On the other hand, for working on my main product, it's pretty useless because all of that work is debugging and making a lot of small changes all over the code base, which is too advanced for it currently. And I think that will get solved, but it isn't solved yet.
BTW the above paragraphs were dictated using the Whisper API. I didn't change a single thing about it. Whisper is just as impressive and useful as the LLMs, in my view.