Radial Suppression Accelerates Algorithmic Generalization: A Geometric Analysis of Delayed Generalization

Srijan Tiwari, Aditya Chauhan, Manjot Singh | June 30, 2026 | arXiv

Key Takeaway

Penalizing the radial expansion of neural network activations forces the network to learn compact, structured representations and substantially speeds up generalization on algorithmic tasks. It is a simple geometric insight with practical training benefits.

Summary

The paper argues that neural networks memorize before generalizing on algorithmic tasks because their hidden representations inflate radially during training. It proposes a geometric penalty that constrains activations to a hypersphere, forcing the network to learn structured circuits sooner. The reported gains are a 6x acceleration of grokking on arithmetic tasks and a halving of training time for addition.
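To make the idea of a radial penalty concrete, here is a minimal sketch in plain Python. The summary does not give the paper's exact formulation, so everything here is an assumption for illustration: the squared-deviation form of the penalty, the names `radial_angular`, `radial_penalty`, and `total_loss`, and the parameters `target_radius` and `lam`. It splits each activation vector into a radius and a unit direction, then penalizes how far the radius drifts from a fixed hypersphere.

```python
import math


def radial_angular(h):
    """Split an activation vector h into (radius, unit direction)."""
    r = math.sqrt(sum(x * x for x in h))
    if r == 0.0:
        # The direction is undefined for the zero vector; return zeros.
        return 0.0, [0.0] * len(h)
    return r, [x / r for x in h]


def radial_penalty(batch, target_radius=1.0):
    """Mean squared deviation of activation norms from target_radius.

    Assumed form: the paper's actual penalty may differ.
    """
    total = 0.0
    for h in batch:
        r, _ = radial_angular(h)
        total += (r - target_radius) ** 2
    return total / len(batch)


def total_loss(task_loss, batch, lam=0.1, target_radius=1.0):
    """Task loss plus the weighted radial penalty (lam is hypothetical)."""
    return task_loss + lam * radial_penalty(batch, target_radius)


if __name__ == "__main__":
    activations = [[3.0, 4.0], [1.0, 0.0]]
    print(radial_angular(activations[0]))
    print(radial_penalty(activations))
    print(total_loss(0.5, activations))
```

In a real training loop the same quantity would be computed on hidden-layer tensors in an autodiff framework so that its gradient pulls activation norms back toward the sphere. The angular component is left unpenalized, which is where the structured representation is expected to form.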

training, reasoning, efficiency

Key Terms

grokking, radial-angular-decomposition, memorization-generalization-delay, activation-space-dynamics