A type of quantization that compresses only the model's learned parameters (weights) to a lower-precision format, while activations and other computations stay at higher precision.
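As a rough illustration, the sketch below stores a layer's weights as 8-bit integers and converts them back to floating point just before the matrix multiply, so the activations and the arithmetic stay in float32. The symmetric per-row int8 scheme, the function names `quantize_weights_int8` and `linear_weight_only`, and the array shapes are all assumptions chosen for the example; real schemes vary in bit width, grouping, and where dequantization happens.

```python
import numpy as np

def quantize_weights_int8(w):
    """Symmetric per-row quantization: returns (int8 weights, float32 scales)."""
    max_abs = np.max(np.abs(w), axis=1, keepdims=True)
    # Use a scale of 1.0 for all-zero rows to avoid dividing by zero.
    scale = np.where(max_abs > 0, max_abs / 127.0, 1.0).astype(np.float32)
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def linear_weight_only(x, q, scale):
    """Dequantize the stored int8 weights, then multiply in float32."""
    w_hat = q.astype(np.float32) * scale
    return x @ w_hat.T

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)  # hypothetical layer weights
x = rng.standard_normal((2, 8)).astype(np.float32)  # activations stay float32

q, scale = quantize_weights_int8(w)
y_ref = x @ w.T                       # full-precision reference output
y_q = linear_weight_only(x, q, scale) # output using quantized weights
max_err = float(np.max(np.abs(y_ref - y_q)))

print("weight bytes:", w.nbytes, "->", q.nbytes)
print("max output error:", max_err)
```

Only the stored weights shrink here (one byte per value instead of four, plus one scale per row); the inputs and outputs of the layer are untouched, which is what distinguishes this from schemes that also quantize activations.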