Claude 3.7 Sonnet thinks carefully before it speaks: it can switch between a quick-response mode and an extended thinking mode in which it visibly works through problems step by step. It handles nuanced writing, coding, and analysis in a notably conversational tone, though it can be cautious and occasionally over-explains. The extended thinking mode is strongest on complex, multi-step problems.

| Benchmark | Score | Type | Recorded |
|---|---|---|---|
| SWE-Bench | 70.3 | accuracy | 5d ago |
| TerminalBench | 35.2 | accuracy | 5d ago |
| TAU2 | 58.4 | accuracy | 5d ago |
| AIME 2024 | 80.0 | accuracy | 5d ago |
| Aider Polyglot | 64.9 | accuracy | 5d ago |