I think some earlier NLP applications had something called an "unknown token" (UNK), which they would substitute for any unseen word. But in recent implementations, I don't think it's used anymore.
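For context, the reason recent implementations can drop the UNK token is subword tokenization (BPE and similar schemes): any word the tokenizer has never seen is split into smaller pieces that are in the vocabulary. Here's a minimal sketch using tiktoken; the gibberish table name is just a made-up example:

```python
import tiktoken

# GPT-4-style BPE vocabulary (illustrative choice, not from the thread)
enc = tiktoken.get_encoding("cl100k_base")

word = "flurbogram_customers_tbl"  # a made-up, "unseen" identifier
token_ids = enc.encode(word)

# The unseen word decomposes into familiar subword pieces -- no UNK needed
print(token_ids)
print([enc.decode([t]) for t in token_ids])
```

So the model never sees a truly unknown token, only an unfamiliar sequence of familiar pieces, which it can attend to and copy back into its output.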
It still baffles me why such a stochastic parrot / next-token predictor will recognize these "unseen combinations of tokens" and reuse them in its response.
Everything falls into place once you understand that LLMs really are learning the hierarchical concepts inherent in the structured data they were trained on. These concepts live in a high-dimensional latent space. Within this space there is a concept of nonsense/gibberish/placeholder, which your sequence of unseen tokens maps to. The model then combines that with the concept of SQL tables, producing (hopefully) the intended answer.