Language models built for a specific character persona can be shrunk by 50% or more while retaining 93.8% of role-playing quality, which makes multi-NPC applications practical without sacrificing character consistency.
This paper introduces Persona-Pruner, a technique that builds lightweight language models for specific character roles by identifying and preserving only the persona-relevant parts of a full model. Unlike standard pruning, which removes parameters without regard to the target persona, this method maintains role-playing quality while reducing computational cost, which is useful for applications with many NPCs.
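To make "preserving only the persona-relevant parts" concrete, here is a minimal sketch of persona-aware pruning on a single toy layer. It is not the paper's algorithm, which this summary does not describe. It swaps in a generic magnitude-times-activation score (in the style of Wanda pruning) as a stand-in, and the function names, weights, and calibration activations are all hypothetical.

```python
import math


def feature_norms(calibration_inputs):
    """L2 norm of each input feature across the calibration samples."""
    n_features = len(calibration_inputs[0])
    return [
        math.sqrt(sum(sample[j] ** 2 for sample in calibration_inputs))
        for j in range(n_features)
    ]


def prune_for_persona(weights, calibration_inputs, sparsity=0.5):
    """Zero the lowest-scoring fraction of weights in each output row.

    Score = |weight| * norm of the matching input feature, so weights that
    feed features the persona text rarely activates are removed first.
    This scoring rule is an assumption, not the Persona-Pruner criterion.
    """
    norms = feature_norms(calibration_inputs)
    pruned = []
    for row in weights:
        scores = [abs(w) * norms[j] for j, w in enumerate(row)]
        n_drop = int(len(row) * sparsity)
        # Indices of the n_drop lowest-scoring weights in this row.
        drop = set(sorted(range(len(row)), key=scores.__getitem__)[:n_drop])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


# Toy layer: 2 outputs, 4 input features (hypothetical values).
weights = [[0.9, -0.1, 0.5, 0.4],
           [0.2, 0.8, -0.7, 0.1]]

# Hypothetical activations recorded while the full model processes
# persona dialogue; features 0 and 2 dominate for this character.
persona_inputs = [[1.0, 0.1, 2.0, 0.0],
                  [0.5, 0.0, 1.5, 0.1]]

pruned = prune_for_persona(weights, persona_inputs, sparsity=0.5)
print(pruned)
```

In the second row, plain magnitude pruning would keep the two largest weights (0.8 and -0.7). The persona-aware score instead drops 0.8, because its input feature is almost never active on the persona data, and keeps the smaller weight 0.2. That difference is the intuition behind pruning for a specific role instead of pruning indiscriminately.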