Inference Throughput
techniques
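The number of predictions (for example, tokens or completed requests) a model generates per unit of time, commonly reported as tokens per second or requests per second. It is a measure of inference speed.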
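As a rough illustration of the definition, throughput can be estimated by counting predictions and dividing by the elapsed wall-clock time. The sketch below is only that: `throughput`, `measure_throughput`, and `dummy_predict` are hypothetical names introduced here, and `dummy_predict` stands in for a real model call.

```python
import time
from typing import Callable, Sequence


def throughput(num_predictions: int, elapsed_seconds: float) -> float:
    """Predictions per second: prediction count divided by elapsed time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_predictions / elapsed_seconds


def measure_throughput(
    predict: Callable[[Sequence], Sequence],
    batches: Sequence[Sequence],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Run `predict` over every batch and return predictions per second."""
    total = 0
    start = clock()
    for batch in batches:
        # Count one prediction per output item returned for the batch.
        total += len(predict(batch))
    elapsed = clock() - start
    return throughput(total, elapsed)


def dummy_predict(batch):
    # Hypothetical stand-in for a real model call (assumption, not a real API).
    return [x * 2 for x in batch]


if __name__ == "__main__":
    batches = [list(range(32)) for _ in range(100)]
    rate = measure_throughput(dummy_predict, batches)
    print(f"{rate:.0f} predictions/s")
```

In practice the measured figure depends on batch size, hardware, and what counts as one prediction, so the unit should be stated alongside any reported number.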