You don't need separate expert sets per layer in MoE models: a single shared expert pool with independent per-layer routers works better and uses fewer parameters, suggesting the standard per-layer expert allocation is unnecessarily wasteful.
In a standard Mixture-of-Experts design, each layer owns its own set of experts. UniPool replaces these per-layer sets with one pool of experts shared across all layers, where each layer keeps only its own router. This removes cross-layer redundancy and lets expert parameters grow sublinearly with model depth, improving performance while reducing parameter count by 30-60% compared to standard MoE.
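To make the structure concrete, here is a minimal PyTorch sketch of the idea: one expert pool shared by every layer, with only the router being layer-local. The class names, top-k routing, and hyperparameters (`SharedExpertPool`, `PooledMoELayer`, `top_k`, etc.) are illustrative assumptions, not UniPool's actual interface.

```python
# Sketch only: names and routing details are assumptions, not UniPool's API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedExpertPool(nn.Module):
    """One pool of expert FFNs, shared by every transformer layer."""
    def __init__(self, num_experts: int, d_model: int, d_ff: int):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

class PooledMoELayer(nn.Module):
    """Routes tokens into the shared pool; only the router is layer-specific."""
    def __init__(self, pool: SharedExpertPool, d_model: int, top_k: int = 2):
        super().__init__()
        self.pool = pool  # shared reference, not a copy
        self.router = nn.Linear(d_model, len(pool.experts))  # per-layer router
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Each token is sent to its top-k experts.
        logits = self.router(x)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in idx[:, k].unique():
                mask = idx[:, k] == e
                out[mask] += weights[mask, k, None] * self.pool.experts[int(e)](x[mask])
        return out

# All layers reuse the same pool, so expert parameters do not grow with depth;
# only the small per-layer routers scale with the number of layers.
pool = SharedExpertPool(num_experts=32, d_model=512, d_ff=2048)
layers = nn.ModuleList(PooledMoELayer(pool, d_model=512) for _ in range(12))
```

Because the pool is a shared module reference, adding layers adds only router parameters (`d_model × num_experts` each), which is where the sublinear growth in expert parameters comes from.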