A compact but capable generalist that punches above its weight class. Handles multilingual tasks, with particular strength in Chinese and English, and holds its own on structured reasoning and code generation. At 7.6B parameters, it can stall on complex multi-step problems where larger models have more headroom.

| Benchmark | Score | Type | Recorded |
|---|---|---|---|
| GPQA Diamond | 10.0 | accuracy | 29d ago |
| BBH | 35.8 | accuracy | 29d ago |
| IFEval | 33.7 | accuracy | 29d ago |
| MuSR | 14.1 | accuracy | 29d ago |
| MMLU-Pro | 37.4 | accuracy | 29d ago |
| MATH | 25.1 | accuracy | 29d ago |