> LLMs can't think. They are generating tokens one at a time
Huh? Sure, they generate tokens one at a time - that's true. But who has shown that predicting tokens one at a time precludes thinking?
It's been shown that models plan ahead, i.e. they think more than one token forward. [1]
How do you explain the world models that have been detected in LLMs? E.g. OthelloGPT [2] is trained on nothing but sequences of game moves, yet it has been shown to learn an internal representation of the board. Same with ChessGPT [3].
For tasks like this (and with words), real thought is required to predict the next token well. E.g. if you don't understand chess at the level of Magnus Carlsen, how are you going to predict Magnus Carlsen's next move? You couldn't do it just from looking at his previous games; you'd have to actually understand chess and think about what a good move would be (and in his style).
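For the curious, the probing setup in [2] is easy to sketch. The snippet below is a minimal illustration only, not the actual OthelloGPT code: the model sizes, names, and data are all made up here. The idea is that you capture hidden activations from a sequence model trained only on moves, then fit a linear probe to read off each square's state; if a linear map can recover the board, the board state is encoded in the activations.

    # Minimal sketch of a linear board-state probe (illustrative, not the code from [2]).
    # Assumed: we already extracted (activation, board) pairs from a move-sequence model.
    import torch
    import torch.nn as nn

    HIDDEN = 512    # hidden size of the hypothetical game-sequence model
    SQUARES = 64    # 8x8 Othello board
    STATES = 3      # empty / current player's / opponent's

    probe = nn.Linear(HIDDEN, SQUARES * STATES)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    def probe_step(activations, boards):
        # activations: (batch, HIDDEN) hidden states captured from the model
        # boards: (batch, SQUARES) integer ground-truth state per square
        logits = probe(activations).view(-1, SQUARES, STATES)
        loss = loss_fn(logits.transpose(1, 2), boards)  # CE over 3 states per square
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()

    # Dummy tensors stand in for real pairs extracted from the trained model.
    acts = torch.randn(32, HIDDEN)
    boards = torch.randint(0, STATES, (32, SQUARES))
    print(probe_step(acts, boards))

High probe accuracy on held-out games is the evidence that the "world model" is really there, rather than the model memorizing surface move statistics.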
Yes, let's cite the most biased possible source: the company selling you the thing, whose runway is funded by keeping the hype train going as long as possible...
[1] https://www.anthropic.com/research/tracing-thoughts-language...
[2] https://www.neelnanda.io/mechanistic-interpretability/othell...
[3] https://adamkarvonen.github.io/machine_learning/2024/01/03/c...