LLMs have a hidden logical reasoning space that behaves the same way whether the model reasons in English or in symbolic notation; this can be exploited to improve reasoning by steering the model toward that shared space, with no retraining required.
The paper finds that LLMs contain a shared internal logical reasoning space that aligns natural-language and symbolic reasoning. By analyzing how the model's internal activations correlate across the two reasoning styles, the authors derive a method for steering the model toward better logical reasoning without any additional training, improving accuracy on reasoning tasks by up to 11%.
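To make the idea of activation-level steering concrete, here is a minimal sketch of the general recipe, assuming the method resembles standard activation steering: extract hidden states for paired natural-language and symbolic versions of the same inference, take their mean difference as a steering direction, and add it to the residual stream at inference time. The model name, layer index, prompt pairs, and steering strength below are illustrative placeholders, not details from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model; the paper's models may differ
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

LAYER = 6  # hypothetical layer where the two reasoning styles are assumed to align

def hidden_at_layer(text: str) -> torch.Tensor:
    """Return the last-token hidden state at LAYER for a prompt."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[LAYER][0, -1]

# Hypothetical paired prompts: the same inference in natural language and in symbols.
nl_prompts = ["If it rains, the ground gets wet. It rains. Therefore, the ground gets wet."]
sym_prompts = ["p -> q. p. Therefore, q."]

# Steering direction: mean difference between symbolic and natural-language activations.
steer = torch.stack([
    hidden_at_layer(s) - hidden_at_layer(n)
    for n, s in zip(nl_prompts, sym_prompts)
]).mean(0)

ALPHA = 4.0  # steering strength (illustrative)

def hook(module, inputs, output):
    # Add the steering direction to the block's output hidden states.
    if isinstance(output, tuple):
        return (output[0] + ALPHA * steer,) + output[1:]
    return output + ALPHA * steer

# Register the hook on the chosen transformer block, generate, then clean up.
handle = model.transformer.h[LAYER].register_forward_hook(hook)
prompt = "If all birds can fly and Tweety is a bird, then"
out_ids = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=20)
handle.remove()
print(tok.decode(out_ids[0], skip_special_tokens=True))
```

The key design choice this sketch illustrates is that no weights are updated: the intervention is a single vector added to one layer's activations at inference time, which is why the approach requires no retraining.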