When pruning training data that contains label noise, tracking how each sample's loss changes over training identifies mislabeled samples more reliably than its instantaneous loss value.
AlignPrune is a plug-and-play module that improves dynamic data pruning when the training data contains mislabeled examples. Instead of relying on each sample's current loss value (which can be misleading under noisy labels), it tracks how a sample's loss changes over the course of training to decide which samples to keep or discard, improving accuracy by up to 6.3%.
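
To make the contrast between instantaneous loss and loss trajectory concrete, here is a minimal sketch in plain Python. It is not AlignPrune's actual scoring rule, which this summary does not specify. The names `LossTrajectoryTracker` and `select_to_keep`, the windowed least-squares slope used as the score, and the toy loss values are all assumptions made for illustration.

```python
from collections import deque


class LossTrajectoryTracker:
    """Hypothetical helper: stores the last `window` loss values per sample id."""

    def __init__(self, window=5):
        self.window = window
        self.history = {}

    def update(self, sample_ids, losses):
        # Append the newest loss of each sample; the deque drops the oldest value.
        for sid, loss in zip(sample_ids, losses):
            self.history.setdefault(sid, deque(maxlen=self.window)).append(float(loss))

    def slope(self, sid):
        # Least-squares slope of loss against step index within the window.
        # Assumption: the slope stands in for "how the loss changes over time".
        ys = list(self.history.get(sid, ()))
        n = len(ys)
        if n < 2:
            return 0.0
        x_mean = (n - 1) / 2
        y_mean = sum(ys) / n
        num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(ys))
        den = sum((i - x_mean) ** 2 for i in range(n))
        return num / den


def select_to_keep(tracker, sample_ids, keep_ratio):
    # Assumed heuristic: keep the samples whose loss is falling fastest
    # (most negative slope) and prune the rest for the next epoch.
    ranked = sorted(sample_ids, key=tracker.slope)
    k = max(1, int(len(ranked) * keep_ratio))
    return ranked[:k]


# Toy per-epoch losses (invented): a hard but correctly labeled sample,
# a mislabeled sample, and an easy sample.
ids = ["clean_hard", "noisy", "easy"]
per_epoch = zip([2.0, 1.6, 1.2, 0.9], [2.0, 2.1, 1.9, 2.0], [0.3, 0.2, 0.15, 0.1])

tracker = LossTrajectoryTracker(window=4)
for epoch_losses in per_epoch:
    tracker.update(ids, epoch_losses)

kept = select_to_keep(tracker, ids, keep_ratio=0.67)
print(kept)
```

In this toy run, `clean_hard` and `noisy` start with the same loss of 2.0, so a rule based on instantaneous loss cannot tell them apart at the first epoch. Their trajectories differ: the loss of `clean_hard` falls steadily while the loss of `noisy` stays flat, so the slope-based rule keeps `clean_hard` and `easy` and prunes `noisy`.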