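A training technique in which a model learns to predict the gradient of the data's log-probability density with respect to the input (the score), which a sampling procedure can then follow to generate new samples.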
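As a rough illustration of this idea, the sketch below fits a score model on toy data and then uses it to draw samples. Everything specific here is an assumption made for the example and not something the definition prescribes: the data is a 1D Gaussian, the model is a linear function `s(x) = a*x + b`, the training objective is the denoising variant (regressing onto the score of a Gaussian noising kernel), and the sampler is unadjusted Langevin dynamics. The function names `fit_linear_score` and `langevin_sample` are hypothetical.

```python
import numpy as np


def fit_linear_score(data, noise_std, rng):
    """Fit s(x) = a*x + b with a denoising objective, solved by least squares.

    Assumption for this sketch: a linear model, which is exact only when
    the (noised) data distribution is Gaussian.
    """
    noise = rng.normal(size=data.shape) * noise_std
    noisy = data + noise
    # Regression target: score of the Gaussian noising kernel at the noisy
    # point, i.e. (data - noisy) / noise_std**2.
    target = -noise / noise_std**2
    # Design matrix with a column for the slope and a column for the intercept.
    design = np.stack([noisy, np.ones_like(noisy)], axis=1)
    (a, b), *_ = np.linalg.lstsq(design, target, rcond=None)
    return a, b


def langevin_sample(a, b, n_samples, n_steps, step_size, rng):
    """Draw samples by following the learned score with added Gaussian noise."""
    x = rng.normal(size=n_samples)  # arbitrary starting points
    for _ in range(n_steps):
        score = a * x + b
        # Unadjusted Langevin update: drift along the score, then diffuse.
        x = x + 0.5 * step_size * score + np.sqrt(step_size) * rng.normal(size=n_samples)
    return x


rng = np.random.default_rng(0)

# Toy data (assumed for illustration): Gaussian with mean 2.0 and std 0.5.
data = rng.normal(loc=2.0, scale=0.5, size=200_000)

a, b = fit_linear_score(data, noise_std=0.1, rng=rng)
samples = langevin_sample(a, b, n_samples=5000, n_steps=500, step_size=0.01, rng=rng)

print("fitted slope and intercept:", a, b)
print("sample mean and std:", samples.mean(), samples.std())
```

In this setup the model learns the score of the slightly noised data, so the fitted slope should land near `-1 / (0.5**2 + 0.1**2)`, and the generated samples should have roughly the mean and spread of the training data. Practical systems replace the linear model with a neural network and typically train across several noise levels, which this sketch does not attempt.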