SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning

Hans Ole Hatzel, Ekaterina Artemova, Haimo Paul Stiemer, Evelyn Gius, Chris Biemann | April 23, 2026 | arXiv

Key Takeaway

Narrative similarity can be operationalized as a practical classification task, and LLM ensembles currently outperform other approaches, but there's significant room for improvement in how systems represent and compare story meanings.

Summary

This paper introduces a shared task on narrative similarity that asks systems to determine which of two stories is more similar to a reference story. The team collected over 1,000 annotated story triples and evaluated 71 submissions from 46 teams, finding that LLM ensembles performed best for classification while fine-tuned embedding models competed well with simpler approaches.
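To make the triple format concrete, here is a minimal sketch of the classification setup: given a reference story and two candidates, pick the candidate that is closer to the reference, then score predictions by accuracy. The bag-of-words `embed` function is a toy stand-in chosen so the example runs without dependencies. It is not the method of any participating system, and all function names and example stories below are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (an assumption; stands in for a real encoder)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def closer_story(reference: str, story_a: str, story_b: str) -> str:
    """Return "A" or "B" for whichever candidate is more similar to the reference."""
    ref = embed(reference)
    sim_a = cosine(ref, embed(story_a))
    sim_b = cosine(ref, embed(story_b))
    # Ties are broken in favor of "A" (an arbitrary choice for this sketch).
    return "A" if sim_a >= sim_b else "B"


def accuracy(triples, gold_labels) -> float:
    """Fraction of triples where the predicted label matches the gold label."""
    predictions = [closer_story(*triple) for triple in triples]
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)


# Hypothetical example triple, invented for illustration only.
reference = "A knight rescues a princess from a dragon."
story_a = "A brave knight fights a dragon to save a princess."
story_b = "A programmer debugs code late at night."
label = closer_story(reference, story_a, story_b)
```

Word overlap captures surface similarity only, so a sketch like this would likely miss stories that share a plot but not vocabulary. Swapping `embed` for a learned encoder leaves the rest of the comparison unchanged.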

Key Terms

narrative-structure, embedding-representation, inter-annotator-agreement