By separating task-specific experts from shared experts with adaptive routing, SETA solves catastrophic forgetting in continual learning without sacrificing performance on new tasks—useful for deploying LLMs that need to learn from multiple domains over time.
SETA is a continual learning framework that prevents LLMs from forgetting old knowledge while learning new tasks by splitting model parameters into task-specific and shared expert modules. Instead of all tasks competing for the same weights, the method uses sparse subspace decomposition to isolate what's unique to each task while preserving shared capabilities across tasks.