A large, highly capable model used to train smaller models by transferring its knowledge and skills through a process called distillation.
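As a rough illustration of what the transfer can look like, here is a minimal sketch of one common distillation objective: the student is trained to match the teacher's temperature-softened output distribution by minimizing a KL divergence. The function names, the default temperature of 2.0, and the example logits are assumptions made for illustration only; real training setups vary and often combine this term with an ordinary supervised loss.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Log-probabilities of temperature-scaled logits (numerically stable)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum_exp for s in scaled]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions.

    The temperature value is an illustrative assumption. Scaling by
    temperature**2 is a common convention that keeps gradient magnitudes
    comparable across different temperatures.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher (target)
    log_q = log_softmax(student_logits, temperature)  # student (prediction)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2

# Hypothetical logits for a three-class problem.
teacher = [2.0, 1.0, 0.1]
student = [0.1, 1.0, 2.0]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's logits exactly and grows as the two distributions diverge, so minimizing it pulls the smaller model's predictions toward the larger model's.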
Multi-step reasoning, logic puzzles, mathematical problem-solving