A specific quantization format that represents model weights using only 4 bits per value, significantly reducing model size while maintaining reasonable performance.