A large mixture-of-experts model that activates only 2 billion of its 24 billion total parameters per forward pass, keeping inference lean while retaining the broad knowledge stored across all experts. It handles text tasks with reasonable depth but reflects the trade-offs of aggressive quantization: 4-bit precision trims memory use at the cost of some output fidelity. Optimized for Apple Silicon via MLX, it runs locally on Mac hardware without a GPU cluster.
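A rough back-of-the-envelope sketch of why this combination works on a single Mac, using only the figures stated above (24B total parameters, 2B active, 4-bit weights) and ignoring KV cache, activations, and runtime overhead, which are real but smaller costs:

```python
def weight_memory_gb(params: float, bits: int) -> float:
    """Weights-only storage estimate in GB (1 GB = 1e9 bytes)."""
    return params * bits / 8 / 1e9

total_params = 24e9   # all experts must be resident in memory
active_params = 2e9   # parameters actually used per forward pass

# Memory is governed by the *total* parameter count and the bit width:
fp16_gb = weight_memory_gb(total_params, 16)  # ~48 GB, beyond most laptops
q4_gb = weight_memory_gb(total_params, 4)     # ~12 GB, fits in unified memory

# Per-token compute is governed by the *active* count: a MoE forward pass
# touches roughly 2 * active_params FLOPs per token, not 2 * total_params.
flops_per_token = 2 * active_params

print(f"fp16 weights: {fp16_gb:.0f} GB, 4-bit weights: {q4_gb:.0f} GB")
print(f"~{flops_per_token:.1e} FLOPs/token")
```

The asymmetry is the point: quantization addresses the memory side (all 24B parameters must be loaded), while sparse expert routing addresses the compute side (only 2B participate per token).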