A compression technique that reduces model size by representing weights using 8-bit floating-point numbers instead of higher precision, making it faster and more memory-efficient.