DeepSeek V3 is a large mixture-of-experts (MoE) model that performs strongly on coding, math, and reasoning tasks relative to its inference cost. Of its 671B total parameters, only about 37B are activated per token, which keeps per-token compute far below that of a dense model of the same size. The open-weight release makes it inspectable and self-hostable, though all of the weights must still be held in memory, so running it comfortably requires a multi-GPU server.
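
To make "activating only a fraction of its weights per token" concrete, here is a minimal sketch of generic top-k expert routing. It is an illustration under stated assumptions, not DeepSeek V3's actual router: the function names `top_k_route` and `moe_forward`, the toy scalar experts, and the router scores are all hypothetical.

```python
import math

def top_k_route(scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only, so the weights sum to 1.
    m = max(scores[i] for i in top)
    exps = [math.exp(scores[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, scores, k):
    """Combine the outputs of only the k selected experts for one token."""
    return sum(w * experts[i](x) for i, w in top_k_route(scores, k))

# Hypothetical toy setup: 8 scalar "experts", 2 active per token.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
scores = [0.1, 2.0, -1.0, 0.5, 2.0, 0.0, -0.3, 1.2]  # made-up router scores
routes = top_k_route(scores, k=2)
active_fraction = len(routes) / len(experts)
```

Only the selected experts are evaluated in `moe_forward`, so per-token compute scales with `k` rather than with the total number of experts. The 2-of-8 ratio here is purely illustrative and unrelated to DeepSeek V3's real configuration.
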

| Benchmark | Score | Metric | Recorded |
|---|---|---|---|
| HellaSwag | 88.9 | accuracy | 5d ago |
| AIME 2024 | 39.2 | accuracy | 5d ago |
| SciCode | 3.1 | main_problem_pass@1 | 5d ago |
| Aider Polyglot | 48.4 | accuracy | 5d ago |
| ARC-Challenge | 95.3 | accuracy | 5d ago |
| WinoGrande | 84.9 | accuracy | 5d ago |