
I believe language models usually use 2-byte (16-bit) token IDs, which corresponds to a vocabulary size of 2^16 = 65,536 entries. If each token were 400 bytes, the vocabulary would have to cover 2^(400*8) = 2^3200 possible values, an extremely large number. Way too large to be practical, I assume.
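The arithmetic in the comment can be checked directly (a quick sketch; the 2-byte and 400-byte figures are the commenter's assumptions, not a statement about any specific model):

```python
import math

# A 2-byte token ID can index 2**16 distinct vocabulary entries.
two_byte_vocab = 2 ** 16
print(two_byte_vocab)  # 65536

# Treating every possible 400-byte sequence as its own token
# would require a vocabulary of 2**(400 * 8) = 2**3200 entries.
four_hundred_byte_vocab = 2 ** (400 * 8)

# Number of decimal digits in that count: far beyond anything enumerable.
digits = len(str(four_hundred_byte_vocab))
print(digits)  # 964
```

In other words, 2^3200 is a number with nearly a thousand decimal digits, vastly more than the number of atoms in the observable universe, so no embedding table could enumerate it.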

