By retrieving learned reasoning skills at inference time instead of reasoning from scratch, you can reduce token usage and improve accuracy—making LLM reasoning cheaper and faster for practical deployment.
This paper proposes distilling reusable reasoning skills from past problem-solving attempts, storing them, and then retrieving and applying the relevant ones at inference time to guide new reasoning. Rather than rederiving every step from scratch, the model recalls applicable skills, avoiding redundant work and reaching solutions faster. Tests on coding and math tasks show reduced token usage alongside improved accuracy.
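The store-then-retrieve loop described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual system: the `SkillLibrary` class, the bag-of-words similarity, and all skill texts are hypothetical assumptions standing in for whatever skill representation and retriever the paper uses.

```python
# Hypothetical sketch of the retrieve-and-apply idea: a library of short
# reasoning skills, a similarity-based retriever, and a prompt builder
# that prepends retrieved skills so the model need not reason from scratch.
# All names and skill texts are illustrative, not the paper's API.
from collections import Counter
import math


def _bow(text):
    """Bag-of-words term counts (stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def _cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SkillLibrary:
    """Stores short reasoning skills distilled from past solutions."""

    def __init__(self):
        self.skills = []  # list of (description, skill_text) pairs

    def add(self, description, skill_text):
        self.skills.append((description, skill_text))

    def retrieve(self, problem, k=2):
        """Return the k skills whose descriptions best match the problem."""
        query = _bow(problem)
        ranked = sorted(self.skills,
                        key=lambda s: _cosine(query, _bow(s[0])),
                        reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(library, problem):
    """Prepend retrieved skills as hints before the new problem."""
    hints = library.retrieve(problem)
    preamble = "\n".join(f"Skill: {h}" for h in hints)
    return f"{preamble}\nProblem: {problem}\nUse the skills above where relevant."


lib = SkillLibrary()
lib.add("two pointer technique for sorted array pair sums",
        "For pair-sum in a sorted array, move two pointers inward.")
lib.add("modular arithmetic for large exponent problems",
        "Reduce exponents with Fermat's little theorem before computing.")
prompt = build_prompt(lib, "Find two numbers in a sorted array that sum to a target")
```

In a real system, the bag-of-words matcher would presumably be replaced by a dense retriever, and skills would be written back to the library after each successful solve, but the control flow (retrieve, prepend, solve) is the same.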