A quantization method that represents model weights as 4-bit numbers instead of higher-precision formats such as 16-bit or 32-bit floating point, cutting weight storage to roughly a quarter of its 16-bit size while accepting some loss in accuracy.
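
As a rough illustration, the Python sketch below quantizes a short list of weights to 4 bits and packs two values into each byte. It assumes a simple symmetric scheme with one shared scale per group; production formats typically differ in group size, code layout, and rounding. The function names and the sample weights are hypothetical and do not come from any particular library.

```python
# Minimal sketch of 4-bit weight quantization (assumed scheme: symmetric,
# one absmax scale per group). All names here are illustrative only.

def quantize_4bit(weights):
    """Map floats to signed integer codes in [-7, 7] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale

def pack_nibbles(codes):
    """Store two 4-bit codes per byte, low nibble first."""
    if len(codes) % 2:
        codes = codes + [0]  # pad to an even count
    out = bytearray()
    for lo, hi in zip(codes[0::2], codes[1::2]):
        out.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(out)

def unpack_nibbles(packed, count):
    """Recover signed 4-bit codes from packed bytes."""
    codes = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            # Nibbles 8..15 encode the negative values -8..-1.
            codes.append(nibble - 16 if nibble >= 8 else nibble)
    return codes[:count]

def dequantize_4bit(codes, scale):
    """Approximate the original floats from codes and the shared scale."""
    return [c * scale for c in codes]

weights = [0.12, -0.53, 0.98, -1.4, 0.07, 0.66, -0.21, 1.05]
codes, scale = quantize_4bit(weights)
packed = pack_nibbles(codes)
restored = dequantize_4bit(unpack_nibbles(packed, len(weights)), scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Eight weights occupy four bytes after packing, plus the one stored scale. The accuracy loss shows up as `max_error`, which in this scheme is bounded by half the scale.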