LLMs can transfer knowledge to new scenarios but struggle with longer problem horizons due to recursive instability—a fundamental limitation that training and inference tricks cannot fully overcome.
This paper tests whether language models can generalize to new situations, using shortest-path planning as a controlled test case. The researchers find that models handle unseen maps well but fail as path lengths grow, due to instability in how they recursively process long sequences.
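A shortest-path planning task of the kind used as the test bed can be sketched with a classical solver for reference. This is an illustrative sketch only: the grid representation, BFS solver, and example maze below are assumptions for exposition, not the paper's actual benchmark setup.

```python
from collections import deque

def shortest_path(grid, start, goal):
    """Breadth-first search over a grid of 0 (free) / 1 (wall) cells.

    Returns the list of (row, col) cells from start to goal, or None
    if the goal is unreachable. BFS guarantees the path is shortest.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([start])
    parent = {start: None}  # also serves as the visited set
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Reconstruct the path by walking parent links back to start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in parent):
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

# Hypothetical 3x3 maze: 0 = free cell, 1 = wall.
maze = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = shortest_path(maze, (0, 0), (2, 0))
print(len(path) - 1)  # → 6 (steps in the optimal path)
```

Problem horizon here corresponds to path length: a model that plans correctly on short paths of this kind may still fail as the number of steps grows, which is the failure mode the paper attributes to recursive instability.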