You can now predict how removing training data affects a model's predictions, with computational overhead only polynomial in the desired accuracy, enabling practical data attribution and privacy analysis without expensive retraining.
This paper solves the data deletion problem: predicting how a trained model would have behaved if specific training data had been excluded, without retraining from scratch. The method uses a "sketching" technique based on higher-order derivatives to approximate the counterfactual model's behavior efficiently, with minimal overhead compared to standard training.
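To make the counterfactual concrete, here is a minimal first-order sketch of the same idea (not the paper's higher-order method): an influence-function-style estimate of how a ridge-regression model would change if one training point were deleted, compared against actual retraining. All data, names, and parameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 50, 3, 0.1

# Illustrative synthetic regression data.
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def fit(X, y):
    """Ridge regression: minimize 0.5*||Xw - y||^2 + 0.5*lam*||w||^2."""
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w = fit(X, y)
H = X.T @ X + lam * np.eye(d)      # Hessian of the full training objective

i = 7                               # index of the point to "delete"
grad_i = X[i] * (X[i] @ w - y[i])   # gradient of the deleted point's loss at w

# First-order prediction of the leave-one-out weights: one Newton step on the
# reduced objective, reusing the full-data Hessian (no retraining needed).
w_approx = w + np.linalg.solve(H, grad_i)

# Ground truth: actually retrain with point i removed.
w_exact = fit(np.delete(X, i, axis=0), np.delete(y, i))

rel_err = np.linalg.norm(w_approx - w_exact) / np.linalg.norm(w_exact - w)
print(f"relative error of first-order deletion estimate: {rel_err:.3f}")
```

The approximation error here comes from reusing the full-data Hessian; the higher-order derivative terms that the paper's sketch retains are what shrink this gap beyond first order.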