A distillation technique that aligns statistical moments (such as the mean and variance) between a teacher model and a student model.
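
As a rough illustration of the idea, the sketch below penalizes gaps in the per-feature mean and variance between a batch of teacher features and a batch of student features. The helper name `moment_matching_loss`, the choice of the first two moments, and the use of feature batches are assumptions made for this example, not details taken from the definition; practical methods may match different moments or different quantities.

```python
import numpy as np


def moment_matching_loss(teacher_feats, student_feats):
    """Squared gap between the first two moments of two feature batches.

    Hypothetical helper for illustration only. Both inputs are expected
    to have shape (batch, features).
    """
    t = np.asarray(teacher_feats, dtype=np.float64)
    s = np.asarray(student_feats, dtype=np.float64)
    # First moment: per-feature mean over the batch dimension.
    mean_gap = np.mean((t.mean(axis=0) - s.mean(axis=0)) ** 2)
    # Second central moment: per-feature variance over the batch dimension.
    var_gap = np.mean((t.var(axis=0) - s.var(axis=0)) ** 2)
    return mean_gap + var_gap


# Synthetic stand-ins for teacher and student activations (assumed data).
rng = np.random.default_rng(0)
teacher = rng.normal(loc=1.0, scale=2.0, size=(512, 8))
student_far = rng.normal(loc=0.0, scale=1.0, size=(512, 8))
student_near = rng.normal(loc=1.0, scale=2.0, size=(512, 8))

print("mismatched student:", moment_matching_loss(teacher, student_far))
print("matched student:   ", moment_matching_loss(teacher, student_near))
```

In this toy setup, the student drawn from the same distribution as the teacher should score a much lower loss than the mismatched one. In training, a term like this would typically be minimized with respect to the student's parameters using an autodiff framework; NumPy is used here only to keep the sketch self-contained.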