ThinkLLM

INT4 Quantization

deployment

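A compression technique that reduces a model's size and memory usage by storing weights as 4-bit integers instead of higher-precision numbers such as 16-bit floats. Weight storage shrinks to roughly a quarter of its 16-bit size, which makes the model cheaper and often faster to run, usually at the cost of a small loss in accuracy.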
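To make the idea concrete, here is a minimal sketch of one common scheme: symmetric quantization with one floating-point scale per small group of weights. The function names, the group size of 4, and the sample weights are all assumptions chosen for illustration; real inference libraries use larger groups and more elaborate methods, and this is not any particular library's implementation.

```python
def quantize_int4(weights, group_size=4):
    """Symmetric per-group quantization to signed 4-bit codes in [-8, 7].

    Illustrative only: names and group_size are hypothetical choices.
    """
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # One float scale per group; use 1.0 if the whole group is zero.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        for w in group:
            q = round(w / scale)
            codes.append(max(-8, min(7, q)))  # clamp to the 4-bit range
    return codes, scales


def dequantize_int4(codes, scales, group_size=4):
    """Map 4-bit codes back to approximate float weights."""
    return [q * scales[i // group_size] for i, q in enumerate(codes)]


def pack_int4(codes):
    """Pack two 4-bit codes into each byte, low nibble first."""
    padded = codes + [0] * (len(codes) % 2)
    pairs = zip(padded[0::2], padded[1::2])
    return bytes(((hi & 0xF) << 4) | (lo & 0xF) for lo, hi in pairs)


# Made-up example weights, not taken from any real model.
weights = [0.12, -0.50, 0.33, 0.07, 1.40, -0.90, 0.02, -1.10]
codes, scales = quantize_int4(weights)
restored = dequantize_int4(codes, scales)
packed = pack_int4(codes)  # 8 weights fit in 4 bytes, plus the scales
```

Each restored weight differs from the original by at most half of its group's scale, which is where the accuracy loss comes from. The scales are stored alongside the packed codes, so the real saving is slightly less than a pure 4-bits-per-weight count suggests.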
