The ability of a model to process and understand very long sequences of text while maintaining coherence across distant parts of the input.