This is a compact, open-weight model that runs locally via MLX with 8-bit quantization, which makes it well suited to on-device inference on Apple Silicon hardware. It handles text-in, text-out tasks and is based on Google's Gemma 4 architecture at the 12B parameter scale. Quantizing the weights to 8 bits keeps the memory footprint manageable, at the cost of some numerical precision relative to the unquantized weights.
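
To give a rough sense of the memory trade-off described above, the sketch below estimates weight storage at different bit widths. It is a back-of-the-envelope calculation, not a measurement of this model: the parameter count is taken as exactly 12 billion, and the group size of 64 with a 16-bit scale and a 16-bit bias per group is an assumption about how the quantized weights are laid out. The helper name `estimate_weight_bytes` is hypothetical, and the figures ignore the KV cache, activations, and any layers left unquantized.

```python
from typing import Optional


def estimate_weight_bytes(
    num_params: float,
    bits: int,
    group_size: Optional[int] = None,
    overhead_bits_per_group: int = 32,
) -> float:
    """Estimate bytes needed to store model weights.

    num_params: number of weights (assumed, e.g. 12e9 for a 12B model).
    bits: bits per stored weight (16 for half precision, 8 for 8-bit).
    group_size: weights per quantization group; None means unquantized.
    overhead_bits_per_group: extra bits per group for scale and bias
        (assumed 16 + 16 = 32 here).
    """
    bits_per_weight = float(bits)
    if group_size is not None:
        # Spread the per-group scale/bias cost across the group's weights.
        bits_per_weight += overhead_bits_per_group / group_size
    return num_params * bits_per_weight / 8


NUM_PARAMS = 12e9  # assumption: exactly 12 billion parameters

half_precision = estimate_weight_bytes(NUM_PARAMS, bits=16)
eight_bit = estimate_weight_bytes(NUM_PARAMS, bits=8, group_size=64)

print(f"16-bit weights: {half_precision / 1e9:.2f} GB")
print(f"8-bit weights:  {eight_bit / 1e9:.2f} GB")
```

Under these assumptions the 8-bit weights take roughly half the space of 16-bit weights, with a small overhead for the per-group scale and bias. Actual memory use on a given machine will be higher once the KV cache and runtime buffers are included.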