A mid-sized multimodal model from Google's Gemma 4 family, quantized to 4-bit precision by the MLX community for efficient local inference on Apple Silicon. Storing weights in 4 bits cuts their memory footprint to roughly a quarter of a 16-bit version's, which makes the model runnable on consumer hardware at the cost of some quality relative to the full-precision versions. It accepts both text and image inputs and offers a practical balance between capability and resource use.
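
As a rough illustration of the memory saving, the sketch below estimates weight storage at 16-bit and 4-bit precision. The parameter count is a hypothetical placeholder rather than this model's actual size, and the per-group overhead (one 16-bit scale and one 16-bit bias per 64 weights) is an assumption about a typical grouped quantization layout, not a confirmed detail of this checkpoint. Activations, the KV cache, and any layers left unquantized are ignored.

```python
def effective_bits(bits: int, group_size: int, overhead_bits: int = 32) -> float:
    """Bits per weight once per-group metadata is included.

    overhead_bits assumes one 16-bit scale plus one 16-bit bias per group.
    """
    return bits + overhead_bits / group_size


def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Hypothetical parameter count, chosen only to make the arithmetic concrete.
HYPOTHETICAL_PARAMS = 10e9

fp16 = weight_memory_gib(HYPOTHETICAL_PARAMS, 16)
q4 = weight_memory_gib(HYPOTHETICAL_PARAMS, effective_bits(4, 64))

print(f"16-bit weights: {fp16:.1f} GiB")
print(f"4-bit weights:  {q4:.1f} GiB")
print(f"ratio:          {q4 / fp16:.2f}")
```

Under these assumptions the quantized weights take slightly more than a quarter of the 16-bit size, because the per-group scales and biases add about half a bit per weight.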