A compact, open-weight model from Google's Gemma family, quantized to 4-bit precision for efficient local deployment via Apple's MLX framework. The 4-bit quantization cuts the weights' memory footprint to roughly a quarter of a 16-bit checkpoint's, so the model runs comfortably on Apple Silicon hardware, at the cost of some output quality relative to the full-precision weights. It handles text-in, text-out tasks and runs entirely on-device, with no cloud dependencies.
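
For reference, here is a minimal local-inference sketch using the `mlx-lm` Python package. The model identifier is an assumption (a 4-bit Gemma conversion hosted under the mlx-community namespace); substitute the exact repo for the variant you are deploying.

```python
# Minimal sketch: local text generation with mlx-lm on Apple Silicon.
from mlx_lm import load, generate

# load() fetches (or reads from the local cache) the quantized weights
# and tokenizer. The repo name below is a placeholder assumption; use
# the actual 4-bit MLX conversion you intend to run.
model, tokenizer = load("mlx-community/gemma-2-2b-it-4bit")  # hypothetical ID

prompt = "Explain 4-bit quantization in one sentence."

# Instruction-tuned Gemma variants expect their chat template; apply it
# when the tokenizer provides one.
if tokenizer.chat_template is not None:
    prompt = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True,
        tokenize=False,
    )

# generate() performs text-in, text-out decoding entirely on-device.
text = generate(model, tokenizer, prompt=prompt, max_tokens=128)
print(text)
```

The same one-off generation can also be done from the command line via the `mlx_lm.generate` entry point that installs alongside the package, which is handy for quick smoke tests before wiring the model into application code.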