Imagine a Rorschach Test of language, where a certain set of non-recognizable-la...

wongarsu · 2025-04-20T18:56:24 1745175384

> Imagine a Rorschach Test of language, where a certain set of non-recognizable-language tokens invariably causes an LLM to talk about flowers. These strings exist by necessity due to how the LLM's layers are formed.

Maybe not for humanity as a species, but for individual humans there are absolutely token sequences that lead them to talk about certain topics, with nobody able to bring them back on topic. Now you'd probably say those are recognizable token sequences, but do we have a fair process for deciding what's recognizable that isn't inherently biased towards making humans the only rational actor?

I'm not contesting at all that LLMs are built only on language. Their lack of a physical reference point is sometimes laughably obvious. We could argue whether there are signs they also form a world model and reasoning that abstracts from language alone, but that's not even my point. My point is rather that any test or argument that attempts to say that LLMs can't "reason" or "assume" or whatever has to be a test a human could pass. Preferably a test a random human would pass with flying colors.

og_kalu · 2025-04-20T18:46:04 1745174764

I think you are begging the question here.

For one thing, LLMs absolutely form responses from conceptual meanings. This has been demonstrated empirically multiple times now, including again by Anthropic only a few weeks ago. 'Language' is just the input and output, the first and last few layers of the model.

So okay, there exists some set of 'gibberish' tokens that will elicit meaningful responses from LLMs. How does your conclusion, "Therefore, LLMs don't understand", follow from that? Would you also conclude that humans have no understanding of what they see because of the Rorschach test?

> There exists no similar set of tokens for humans, because our process is to parse the incoming sounds into words, use grammar to extract conceptual meaning from those words, and then shape a response from that conceptual meaning.

Grammar is a useful fiction, an incomplete model of a demonstrably probabilistic process. We don't use 'grammar' to do anything.