Long Context Retrieval
Tests how accurately models retrieve specific information from very long documents, measuring long-context comprehension and retrieval accuracy.
Models must locate and extract specific facts, figures, or passages from long documents (10K–1M tokens). Tests robustness of attention mechanisms and context utilisation at extended lengths.
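One way to picture the task described above is a "needle in a haystack" probe: a known fact is planted at a chosen depth in a long filler document, the model is asked for it, and accuracy is the fraction of probes answered correctly. The sketch below is illustrative only and is not this benchmark's actual harness. The function names (`build_haystack`, `exact_match`, `retrieval_accuracy`, `keyword_baseline`), the filler sentences, the passcode needle, and the use of whitespace-separated words as a rough stand-in for tokens are all assumptions made for the example.

```python
import random

# Hypothetical filler text and planted fact, used only for illustration.
FILLER = [
    "The committee reviewed the quarterly logistics report.",
    "Rainfall in the northern valley was above average.",
    "The archive was reorganised by subject and year.",
]
NEEDLE = "The vault passcode is 7421."
QUESTION = "What is the vault passcode?"
EXPECTED = "7421"


def build_haystack(needle, filler, n_words, depth, seed=0):
    """Return a document of at least n_words filler words with the
    needle inserted at a relative depth (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    rng = random.Random(seed)
    sentences, count = [], 0
    while count < n_words:
        sentence = rng.choice(filler)
        sentences.append(sentence)
        count += len(sentence.split())
    sentences.insert(round(depth * len(sentences)), needle)
    return " ".join(sentences)


def exact_match(answer, expected):
    """Case-insensitive check that the expected string appears in the answer."""
    return expected.strip().lower() in answer.strip().lower()


def retrieval_accuracy(model, cases):
    """Fraction of (document, question, expected) cases answered correctly.
    `model` is any callable taking (document, question) and returning a string."""
    if not cases:
        return 0.0
    hits = sum(exact_match(model(doc, q), exp) for doc, q, exp in cases)
    return hits / len(cases)


def keyword_baseline(document, question):
    """Toy stand-in for a real model: returns the first sentence that
    mentions 'passcode'. A real evaluation would call an LLM here."""
    for sentence in document.split(". "):
        if "passcode" in sentence:
            return sentence
    return ""


# Probe three depths at a small, word-based length (real runs would be far longer).
cases = [
    (build_haystack(NEEDLE, FILLER, 2000, depth), QUESTION, EXPECTED)
    for depth in (0.0, 0.5, 1.0)
]
score = retrieval_accuracy(keyword_baseline, cases)
```

Sweeping both the document length and the insertion depth, then reporting accuracy per (length, depth) cell, is a common way to show where retrieval starts to degrade. Containment matching is a deliberately lenient scoring choice here; a stricter harness might require the answer to match exactly.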
No model scores recorded yet.
Scores will appear here as the pipeline processes model data.