ThinkLLM

FP4 Quantization

formats

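A compression technique that stores model weights as 4-bit floating-point numbers instead of wider formats such as FP16 or BF16. At 4 bits per weight, the weights take roughly a quarter of the memory of a 16-bit model, plus a small overhead for scaling factors. It can also speed up inference, mainly where memory bandwidth is the bottleneck or the hardware supports 4-bit arithmetic natively, usually at some cost in accuracy.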
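To make the idea concrete, here is a minimal sketch of rounding weights to a 4-bit floating-point grid. It assumes the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit) and a single absmax scale for the whole group of weights; the function names `quantize_fp4` and `dequantize_fp4` are hypothetical and do not come from any particular library. Real implementations typically use per-block scales and pack two codes per byte.

```python
# Magnitudes representable in the assumed E2M1 layout.
# The list index equals the 3-bit exponent/mantissa pattern.
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def quantize_fp4(weights):
    """Round each weight to the nearest scaled E2M1 value.

    Returns (codes, scale), where each code is an int in 0..15:
    bit 3 is the sign, bits 0-2 index E2M1_MAGNITUDES.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude onto 6.0, the largest E2M1 value.
    scale = max_abs / E2M1_MAGNITUDES[-1] if max_abs > 0 else 1.0
    codes = []
    for w in weights:
        target = abs(w) / scale
        # Nearest representable magnitude (linear scan over 8 values).
        index = min(
            range(len(E2M1_MAGNITUDES)),
            key=lambda i: abs(E2M1_MAGNITUDES[i] - target),
        )
        sign_bit = 1 if w < 0 else 0
        codes.append((sign_bit << 3) | index)
    return codes, scale


def dequantize_fp4(codes, scale):
    """Reconstruct approximate weights from 4-bit codes and the shared scale."""
    return [
        (-1.0 if code & 0b1000 else 1.0) * E2M1_MAGNITUDES[code & 0b0111] * scale
        for code in codes
    ]


if __name__ == "__main__":
    weights = [0.42, -0.07, 0.9, -1.3, 0.0]
    codes, scale = quantize_fp4(weights)
    print(codes, scale)
    print(dequantize_fp4(codes, scale))
```

Each weight is reduced to one of 16 codes, so the reconstructed values only approximate the originals. The shared scale is what lets such a coarse grid cover the actual range of the weights.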
