Sparse Subspace-to-Expert Sharing for Task-Agnostic Continual Learning

Fatema Siddika, Md Anwar Hossen, Tanwi Mallick, Ali Jannesari | June 5, 2026 | arXiv

Key Takeaway

By separating task-specific experts from shared experts and routing between them adaptively, SETA mitigates catastrophic forgetting in continual learning without sacrificing performance on new tasks. This makes it useful for deploying LLMs that must learn from multiple domains over time.

Summary

SETA is a continual learning framework that prevents LLMs from forgetting old knowledge while learning new tasks by splitting model parameters into task-specific and shared expert modules. Instead of all tasks competing for the same weights, the method uses sparse subspace decomposition to isolate what's unique to each task while preserving shared capabilities across tasks.
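The general idea of combining an always-active shared module with sparsely routed experts can be illustrated with a minimal sketch. This is not SETA's implementation: the summary does not specify how the subspaces are decomposed or how the router is trained, so the low-rank adapter form, the top-k softmax gate, and every name below (SharedPlusSparseExperts, top_k_gate, rank, k) are assumptions chosen only for illustration. The router sees only the input vector, never a task ID, which is one plausible reading of "task-agnostic".

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a (rows x cols) nested-list matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def top_k_gate(logits, k):
    """Softmax over the k largest logits; every other expert gets weight 0."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(logits))]


def random_matrix(rows, cols, rng, scale=0.02):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class SharedPlusSparseExperts:
    """Hypothetical layer: one shared adapter plus top-k routed experts."""

    def __init__(self, d_model, n_experts, rank, k, seed=0):
        rng = random.Random(seed)
        self.k = k
        # Router maps the input to one logit per expert (no task ID involved).
        self.router = random_matrix(n_experts, d_model, rng)
        # Assumption: each module is a low-rank (down, up) adapter pair.
        self.shared = (random_matrix(rank, d_model, rng),
                       random_matrix(d_model, rank, rng))
        self.experts = [
            (random_matrix(rank, d_model, rng), random_matrix(d_model, rank, rng))
            for _ in range(n_experts)
        ]

    @staticmethod
    def _adapter(pair, x):
        down, up = pair
        return matvec(up, matvec(down, x))

    def forward(self, x):
        gate = top_k_gate(matvec(self.router, x), self.k)
        # The shared adapter is applied to every input.
        out = [xi + si for xi, si in zip(x, self._adapter(self.shared, x))]
        for weight, expert in zip(gate, self.experts):
            if weight > 0.0:  # sparse: skip experts the router did not select
                delta = self._adapter(expert, x)
                out = [o + weight * d for o, d in zip(out, delta)]
        return out, gate


layer = SharedPlusSparseExperts(d_model=8, n_experts=4, rank=2, k=2)
output, gate = layer.forward([0.1 * i for i in range(8)])
```

In a continual-learning setting, a design like this would typically freeze or regularize the experts used by earlier tasks while letting unused experts adapt to new data. Whether SETA does so, and how, is not stated in this summary.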

training efficiency architecture

Key Terms

catastrophic-forgetting, plasticity-stability-dilemma, sparse-expert-routing, continual-learning, backward-transfer