The maximum number of tokens a model can attend to at once; the prompt, conversation history, and generated response must all fit within this limit.
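A minimal sketch of how an application might keep a conversation within this limit by dropping the oldest messages first. The window size, reserved reply budget, and whitespace-based `count_tokens` heuristic are illustrative assumptions, not a real tokenizer or any specific model's limit:

```python
# Sketch: fit chat history into a fixed context window.
# Assumptions: CONTEXT_WINDOW and RESERVED_FOR_REPLY are hypothetical
# values; count_tokens is a crude stand-in for a real tokenizer.

CONTEXT_WINDOW = 8192        # hypothetical model limit, in tokens
RESERVED_FOR_REPLY = 1024    # budget kept free for the model's response

def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace word."""
    return len(text.split())

def trim_history(messages: list[str]) -> list[str]:
    """Drop oldest messages until the remainder fits the token budget."""
    budget = CONTEXT_WINDOW - RESERVED_FOR_REPLY
    kept: list[str] = []
    total = 0
    # Walk newest-to-oldest so the most recent turns survive truncation.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if total + cost > budget:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))

history = ["word " * 10000, "recent question?"]
print(trim_history(history))  # oversized oldest message is dropped
```

Real systems would use the model's own tokenizer for counting and often summarize dropped turns rather than discarding them outright.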
How well a model maintains its performance as documents and conversations grow long, rather than degrading as the context fills.