A compact but capable generalist that punches above its weight class. Handles multilingual tasks, with particular strength in Chinese and English, and holds its own on structured reasoning and code generation. At 7.6B parameters, it can hit a ceiling on complex multi-step problems where larger models have more headroom.

| Benchmark | Score | Type | Recorded |
|---|---|---|---|
| IFEval | 33.7 | accuracy | 2mo ago |
| MATH | 25.1 | accuracy | 2mo ago |
| GPQA Diamond | 10.0 | accuracy | 2mo ago |
| MuSR | 14.1 | accuracy | 2mo ago |
| MMLU-Pro | 37.4 | accuracy | 2mo ago |
| BBH | 35.8 | accuracy | 2mo ago |