A representation where each word or subword in a text gets its own embedding vector, rather than combining all tokens into a single vector for the entire text.
Performance retention over long documents and conversations