LLMs can transfer knowledge to new scenarios but struggle with longer problem horizons due to recursive instability—a fundamental limitation that training and inference tricks cannot fully overcome.
This paper tests whether language models can generalize to new situations, using shortest-path planning as a controlled test case. The researchers find that models handle unseen maps well but fail as path lengths grow, due to instability in how they recursively process long sequences.
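A shortest-path planning task of the kind used as the test bed can be sketched with a classical solver for reference. This is an illustrative sketch only: the grid representation, BFS solver, and example maze below are assumptions for exposition, not the paper's actual benchmark setup.

```python
from collections import deque

def shortest_path(grid, start, goal):
    """Breadth-first search over a grid of 0 (free) / 1 (wall) cells.

    Returns the list of (row, col) cells from start to goal, or None
    if the goal is unreachable. BFS guarantees the path is shortest.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([start])
    parent = {start: None}  # also serves as the visited set
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Reconstruct the path by walking parent links back to start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in parent):
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

# Hypothetical 3x3 maze: 0 = free cell, 1 = wall.
maze = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = shortest_path(maze, (0, 0), (2, 0))
print(len(path) - 1)  # → 6 (steps in the optimal path)
```

Problem horizon here corresponds to path length: a model that plans correctly on short paths of this kind may still fail as the number of steps grows, which is the failure mode the paper attributes to recursive instability.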