How Width and Data Shape Generalization Scaling Laws in Quadratic Neural Networks

Julius Girardin, Emanuele Troiani, Yizhou Xu, Vittorio Erba, Florent Krzakala et al. | June 26, 2026 | arXiv

Key Takeaway

Generalization does not scale uniformly with width and data: the scaling behavior differs from one regime to the next, and the spectral structure of the data determines how quickly performance improves.

Summary

This paper analyzes how neural networks generalize as model size and training data scale jointly. Using a simplified quadratic network model with structured data, the researchers derive exact formulas showing that the generalization error follows different power laws depending on the ratio of parameters to samples, revealing distinct phases such as the onset of interpolation.
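To make the ingredients of the summary concrete, here is a minimal sketch in plain Python. It is not the paper's construction. It assumes a two-layer network with quadratic activation normalized by the width, Gaussian inputs whose covariance eigenvalues decay as a power law, and a log-log least-squares fit to read off a scaling exponent. The function names, the `1/m` normalization, and the `decay` parameter are all hypothetical choices made for illustration.

```python
import math
import random


def power_law_spectrum(d, decay):
    """Eigenvalues lambda_k = k^(-decay) for k = 1..d (assumed form of the data spectrum)."""
    return [k ** (-decay) for k in range(1, d + 1)]


def sample_input(spectrum, rng):
    """Draw x with independent Gaussian coordinates of variance lambda_k."""
    return [math.sqrt(lam) * rng.gauss(0.0, 1.0) for lam in spectrum]


def quadratic_network(x, weights):
    """Quadratic network f(x) = (1/m) * sum_j (w_j . x)^2, with m = len(weights) hidden units.

    The 1/m normalization is an assumption made for this sketch.
    """
    m = len(weights)
    total = 0.0
    for w in weights:
        pre_activation = sum(wi * xi for wi, xi in zip(w, x))
        total += pre_activation ** 2
    return total / m


def fit_power_law_exponent(ns, errors):
    """Estimate alpha in error ~ c * n^(-alpha) by least squares on log(error) vs log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errors]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    var = sum((x - x_mean) ** 2 for x in xs)
    # The slope of the log-log line is -alpha, so negate it.
    return -cov / var
```

In practice one would measure the test error at several sample sizes within a single regime and pass those values to `fit_power_law_exponent`. Fitting across a regime boundary would mix two different exponents and give a misleading slope.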

Key Terms

scaling-laws generalization feature-learning interpolation spectral-properties