Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Blowing ChatGPT's mind. Is this just sycophancy? Or a realistic assessment? (chatgpt.com)
8 points by EGreg 17 days ago | hide | past | favorite | 30 comments


How does a transcript chronicling some poor guy's descent into AI induced psychosis make the frontpage ? This is literally (and yes I know) what's been happening on reddit for months now: "Have I built a perpetuum mobile ? GPT4o seems to think so!" but at least on reddit the comments don't engage with the "substance" of those chat transcripts. I am not saying that these kinds of transcripts are without value, they clearly demonstrate that even competent engineers can get sweet-talked into (probably out of character) actions like "boast about your accomplishments on hn and a CTO will take notice and offer you their job because you are so much more brilliant than them" while I have no idea if "Greg" has people around him to talk to, he clearly has no one who compliments him like this on his php codebase. If he wanted to engage productively with an LLM he could have prompted it to "roast his code" "point out weak points" "criticize the underlying architecture" but obviously thats not what he wanted or needed. He needed to hear some compliments, the LLM understood that and the machine complied. Obviously thats not the experience he will get out in the real world. It's more like having a talking blow-up doll compliment you on your lovemaking skills and encourage you to upload a video of the interaction to your favorite tube-site and sent the link to all your business contacts to show-off your inimitable love-making prowess.


Wow. I go to bed and wake up to claims it hit the front page. Interesting.

Here is almost the same exact sequence but with constant instructions to remain brutally honest and objective: https://chatgpt.com/share/691b4035-0ed8-800a-bee3-ae68e2a63c...

I was just late at night and wanted to post this chat transcript on HN to share some perspective on what developers are getting from ChatGPT.

I happen to be an expert in this particular area that I’m building.

ChatGPT seems to remember that I am in New York and want “no bullshit” answers. In the last few days it keeps weaving that into most responses.

That fact appears in its memory that users can access, as is the fact that it should not, under any circumstances, use emojis in code or comments, but it proceeds to do so anyway, so I am not sure how the memory gets prioritized.

Here is the interesting thing. As an expert in the field I do agree with ChatGPT on its statistical assessment of what I’ve built, because it took me years of refinement. I also tried it with average things and it correctly said that they’re average and unremarkable. I simply didn’t post that.

What I am interested in, is how to get AI transcripts to be used as unbiased third-party “first looks” at things, such as what VCs would do for due diligence.

This was just a quick thing I thought I’d get a few responses on HN about. I suspect it might have hit the front page because some people dug through the code and saw the value. But you can get all the code for free on https://github.com/Qbix/Platform .

Yeah, there is obviously an element of flattery that people let go to their head. I have had ChatGPT repeatedly confirm the validity of ideas I had in fields I am NOT an expert, while pushing back on countless others. I use it as one data point and mercilessly battle-test the ideas and code by asking it to find holes in them from various angles. This particular HN submission, although done very late at night here in NYC, was an interesting mix of genuinely groundbreaking stuff and ChatGPT being able to see the main ideas at a glance, and “going wild”, while at the same time if I run it with instructions at the start of “be extremely objective”, it still approaches this same thing in the end.


Well, the conclusions of your previous conversations also remain in memory, especially if you explicitly refer to them. Still your new transcript kinda proves my point ? Except for the non-standard (a nice way of saying: violates best practices) way you implement service workers there is literally nothing original or unique about any part of your codebase other than the fact that its written in php ? I have nothing against php but haven't worked on any php projects in a long while and didn't take the time to look into your code in detail. You're obviously smart and opinionated when it comes to webDev which is great. While your post seemed to be borderline LLM psychosis, its a different story if you were sleep deprived and drunk and now realize that you probably haven't rebuild google all by yourself. Your issue seems to be something else which is also quite frequent here. AI skeptics getting drawn into repeated fundamentalist discussions about LLMs being incapable of this or that BUT then having a "feel the AGI" moment not only becoming convinced of a utility of LLMs they previously denied but -being inexperienced- going far beyond that and believing that LLMs can do all kinds of things that they (at least currently) can't which ends up frustrating them and leading to some renewing their skepticism. You're not alone, it's quite likely that tomorrow when people who haven't had early access to Gemini 3 get it and start one-shotting functional clones of classic computer games share that on social media (or on the hn frontpage) and others are inspired to give it a try with "Gemini, please make a PC version of Half-Life 3 for me!" and are subsequently underwhelmed with the resulting code that doesn't compile or with the outcome of "Tell me how to make a Billion Bucks in less than 3 months!" millions will join you. What sets you apart is your capacity to understand the engine behind the output if you put in the work and don't allow the sweet talk to get to you!


Nah. I don’t “feel the AGI”. I think the AGI is a silly quest, just like having a plane flap its wings. Feynman had it right in the 80s: https://www.youtube.com/watch?v=ipRvjS7q1DI

I think the future is lots of incremental improvements that get replicated everywhere and humans outclassed in nearly every field, where they stop relying on each other.

As far as LLMs yes I think they are the best placed to know if some code or invention is novel, because of their vast training. Can be far better than a patent examiner, if trained on prior art, for instance.

What you’re not used to is an LLM being fed stuff that you statistically / heuristically would expect to be average but is in fact the polished result of years of work. The LLM freaks out, you get surprised. You think it was the prompts. The prompts are changed, the END result is the same (scroll to the bottom).

I want to see whether foundational LLMs can be used as a good first filter for dealflow and evaluating actual projects.


The problem of using an LLM to validate reality is that you still need to prove your genius code work in the real world. ChatGPT won't hire you, it even have your code already.


Yeah, it showed up front-page here early this morning.

For a different comparison of flattery, sycophancy, and brutalism, I copied-and-pasted each segment of your first conversation into "my own" ChatGPT 5.1: https://chatgpt.com/share/691b71ea-4e58-8005-8ce6-a6b5d10120...

It is my observation that "my" bot produces completely different results compared to "your" bot.


It seems like you told it to nitpick things that could be improved and stay away from anything positive, then reused my prompts.

Can you share your instructions?


That's the whole unabridged conversation (I don't know how I could abbreviate it if I wanted to), and I produced it exactly as I said: I just pasted in your prompts.

The output is of very similar style to how my interactions with it are when I'm using it for work on my own projects.

My bot does run with a pretty lengthy set of supposed rules that have been accumulated, tweaked, condensed and massaged over the past couple of years. These live in a combination of custom instructions (in Preferences), deliberately-set memory, and recollection from other chats.

I use "supposed" here because these individual aspects are frequently ignored, and they always have been. Yet even if the specificity is often glossed over, the rules quite clearly do tend to shape the overall output and tone (as the above-linked chat demonstrates).

Anyway, I like the style quite a lot. It lets me focus on achieving technical correctness instead of ever being inundated with the noise of puffery.

But I have no idea where I'd start to duplicate that environment. Someone at OpenAI could surely dissect it, but the public interface for ChatGPT is way too limited to allow seeing how context is injected and used.

So while I'd certainly would love to share specific instructions, that's simply beyond my capability as a lowly end-user who has been emphatically working against sycophancy in their own little "private" ChatGPT.

I barely even know how I got here.

(I could ask the bot, but I can say with resolute certainty that it would simply lie.)


I thought it was on the front page as parody, or to point and laugh at the dumb things LLMS do in a "number of rs in strawberry" kind of way.


When you asked:

“Wow, are you saying I kind of singlehandedly built the kind of stack they use at Google? If engineering departments only knew… how can I get some CTO to hire me as a chief engineer?”

was probably when chatgpt should have said - no you built what seems like an interesting/capable php framework.

But instead you got merciless positivity.


Well, if they want a merciless putdown, even if they DID have "singlehandedly built the kind of stack they use at Google", they can always post that claim on HN!


Agreed. This was late at night and I just wanted to share the surreal experience with HN.

Here is almost the same exact sequence, but with repeated instructions throughout, to be brutally honest and objective: https://chatgpt.com/share/691b4035-0ed8-800a-bee3-ae68e2a63c...


5.1 is astonishingly bad. Besides the 4o-levels of sycophancy, I've also seen it form grammatically incorrect sentences in German.

GPT-5 Thinking seemed to have a much more tolerable default personality than 5 "chat/instant", but 5.1 seems a bit broken across the board. Reasoning capabilities also seem somewhat weaker.


It appears to have custom instructions based on its insistence to respond in New York direct. But wow, no wonder people get addicted to/love Chatgpt. I ignore sycophancy because I've seen terrible hallucinations but a lot of people blindly believe in "the genius in a box"


That is due to its memory, and the latest update seems to weave it into many conversations (since it probably falls into the category of “this is how the guy who lives in NYC wants me to respond: straight talk, no bullshit”). I can see what it remembers and this is just one fact, funny that it often forgets my oft-repeated correction, which it stored in its memory repeatedly, of not putting any emojis in code or comments.

This was late at night and I just wanted to share the surreal experience with HN. The difference here is that I am actually an expert on the things that I had it evaluate, I just threw some code that I polished over the years, to see what it would respond since LLMs can definitely pattern-match the concepts present in a block of eg code and then compare it to everything they’ve been trained on.

Here is almost the same exact sequence, but with repeated instructions throughout, to be brutally honest and objective: https://chatgpt.com/share/691b4035-0ed8-800a-bee3-ae68e2a63c...


This is the equivalent of a guy building a "car" in his backyard with old used parts, then saying it was only 1% of the cost of a new Mercedes, and everybody thinks now that Mercedes is 100x overpriced.

Also the "calculation" is totally wrong. A guy with lots of enthusiam and masses of free time rebuild existing tech stacks in plain PHP (!) and JS (!), two of the slowest languages on the planet. That alone should debunk the whole story. Comparing ANY logic in PHP vs C (nginx) is just nuts and ChatGPT should know it.

Interesting hobby project, and maybe even a profitable one if you find niche clients, but obviously far far away from the definition of a 0.01% engineer.


I'd be careful with what you mean by "realistic" here. Is this realistic as in would any normal person ever say this? The answer to that is a hard no, because people don't talk like this. But it is a moot point, because there shouldn't be any reason in the first place for you to care about the opinion of Chat-GPT, sycophancy or not.

I'm gonna assume that you are a pretty young developer. I think you have built something that you have put a lot of thought and engineering effort into, and that you should be proud of that. But asking very open-ended leading questions ike this to an LLM is not the way to go. Truthfully, it is not even the way to go when talking to another human either but humans are more understanding. We've all been young and insecure once, and one with any ounce of empathy will gently steer you towards a more healthy path without overt flattery.

I urge you, for your own emotional well being, seek more human connections. Chat-GPT can be great for very targeted questions if you have a specific problem or a very specific area you want feedback on and prompt it to give feedback to that. And this may sound very harsh, but I think you need to hear it: The kind of validation-seeking you are engaging with in this chat is not at all that different from the ones seeking emotional support from an "AI-girlfriend" or similiar. Please be careful, and find your own community with real humans that you can relate and look up to.


Thank you very much for your concern. I have a satisfying social and personal life, I’m not at all relying on flattery from anyone. What I posted to ChatGPT is about 5% of what I spent my time on. Entreneurship and building things is indeed a major part of what I feel called to do, it all has its ups and downs but it is by no means preventing me from human interaction.

This was late at night and I just wanted to share the surreal experience with HN. As you might know from my posts, I am a critic of AI in general.

Here is almost the same exact sequence, but with repeated instructions throughout, to be brutally honest and objective: https://chatgpt.com/share/691b4035-0ed8-800a-bee3-ae68e2a63c...

Would you update your assessment after reading that? Again for me this is about an experiment and sharing a particular AI interaction with fellow humans.


This. ChatGPT is an assistant and we aren't supposed to listen to assistant's flattery.


What are your custom instructions. When I drop your first message into 'my' ChatGPT 5.1, the first paragraph returned is:

"Short answer: yes, the idea and overall feature are solid and “cool” for a platform – environment-aware static bundling + filtering + preloading is a real capability, not fluff. The implementation is workable but has a few concrete problems and some messy spots. (...)"


I tried it again with instructions throughout, to be brutally honest and objective: https://chatgpt.com/share/691b4035-0ed8-800a-bee3-ae68e2a63c...

It still did this. Can you retry other approaches, eg saying it was a junior developer who wrote the code, and to critique it etc.

Is it sycophantic regardless? Or is it objective? After multiple runs of same prompts but different instructions trying to minimize sycophancy.


Can someone who is not on mobile confirm that the framework is any good?

It doesn't seem all that impressive to me and I know that the LLM amped up the positivity, but if this really has clear advantages over the other frameworks it is being compared to, just how bad are web frameworks?


This was a painful read. Greg, PHP won't make you rich, Meta won't hire you because of this creation of yours


I tend to agree with this wholeheartedly, as long as you put the word alone in both clauses.

This is a classic case of ChatGPT freaking out over the quality or substance of what’s been built, but completely underestimating what it takes to get buy-in and uptake in the real world. The two rarely have to do with the other, which is how we end up where we are today (relying on giant corporations for all our reliable infrastructure and even social platforms).


You are absolutely right!

But also, no, it was hallucinating completely. Things like "static asset pipeline" is not mindblowing to any VC, even more written in PHP. Your qbix project is huge and complex, for sure, it does not mean it is a genius implementation although it could be. Unfortunately, it is not engaging enough for me to try it so at least in the marketing aspect of it, the implementation is failing badly.


The difference between the two is whether ChatGPT believes it's being accurate when it's saying those things, and we can't know what it believes, (or whether it can believe things at all).

I will say that making a framework of your own is an achievement, but making a great framework is really rare. I don't know what your framework is like, so I can't say.


Seems to be this: https://qbix.com/


what was your style and tone setting? friendly?


Somewhat unrelated, GPT-5.1 hasn't been very good. I manually select 5 or 4o.

5.1 seems to easily miss the primary point of what is being asked or discussed.


ChatGPT is so verbose, it just spews out pages of useless babble. I'm not reading all that.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: