By analyzing which neurons activate during model predictions, you can automatically select better training data and improve self-supervised learning without any human annotations. This is useful when expert labels are expensive or unavailable.
This paper proposes Neuron-OPSD, a method for improving large language models without human labels. It uses the model's internal neuron activations for two purposes: to select which training examples to learn from, and to construct better teacher models. The approach trains the model on its own predictions, achieving better performance on specialized tasks while maintaining general knowledge.
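
To make the data-selection idea concrete, here is a minimal sketch of ranking examples by their activations and keeping the top few. It is an illustration under assumptions, not the paper's procedure: the summary does not say how Neuron-OPSD scores examples, so the scoring rule (mean activation over a chosen neuron subset), the names `score_example` and `select_examples`, the `neuron_ids` parameter, and the toy numbers are all hypothetical.

```python
from typing import List, Sequence


def score_example(activations: Sequence[float], neuron_ids: Sequence[int]) -> float:
    """Mean activation over a chosen subset of neurons.

    Assumption: this scoring rule is a stand-in; the source does not
    specify how activations are turned into a selection score.
    """
    if not neuron_ids:
        raise ValueError("neuron_ids must not be empty")
    return sum(activations[i] for i in neuron_ids) / len(neuron_ids)


def select_examples(
    activations_per_example: List[Sequence[float]],
    neuron_ids: Sequence[int],
    k: int,
) -> List[int]:
    """Return the indices of the k highest-scoring examples, best first."""
    scores = [score_example(a, neuron_ids) for a in activations_per_example]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy data: 4 examples with 3 neurons each (made-up values, not from the paper).
toy_activations = [
    [0.1, 0.9, 0.2],
    [0.8, 0.1, 0.7],
    [0.0, 0.2, 0.1],
    [0.5, 0.5, 0.9],
]

# Keep the 2 examples that most strongly activate hypothetical neurons 0 and 2.
selected = select_examples(toy_activations, neuron_ids=[0, 2], k=2)
print(selected)
```

In a real setting the activation vectors would come from the model itself, for instance by recording hidden-layer outputs during a forward pass, and the selected examples would then feed the self-training step described above.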