A technique applied at inference time, without retraining the model, that adjusts how the model generates outputs.
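As one illustration of this kind of technique, the sketch below uses temperature scaling, which is a common inference-time adjustment and an assumption here, since the entry does not name a specific method. The function name `softmax_with_temperature` and the example logits are hypothetical. The model's raw scores are left unchanged, and only the way they are turned into output probabilities is adjusted.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, rescaled by a temperature.

    Hypothetical helper for illustration: temperature < 1 sharpens the
    distribution, temperature > 1 flattens it. No model weights change.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate outputs from an already-trained model.
logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
```

With the same scores, `sharp` concentrates more probability on the top candidate than `flat` does, so generation behavior shifts even though nothing about the model itself was retrained.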