A multimodal open-weight model that accepts both text and image inputs, packaged in an 8-bit quantized MLX format for Apple Silicon hardware. It sits in a mid-size range that balances capability against the practicality of running locally. Storing the weights at 8 bits reduces the memory footprint compared to full precision (roughly half that of a 16-bit checkpoint), with the usual trade-off of a slight reduction in output quality.
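
To make the memory trade-off concrete, here is a minimal back-of-the-envelope sketch. It estimates weight storage only, ignoring activations, the KV cache, and any layers left unquantized. The parameter count used below is a hypothetical placeholder, not this model's actual size. The per-group overhead is also an assumption, based on a group-wise scheme that stores one 16-bit scale and one 16-bit bias for every 64 weights.

```python
def weight_memory_gib(num_params, bits_per_weight, group_size=None,
                      overhead_bits_per_group=32):
    """Estimate weight storage in GiB.

    group_size / overhead_bits_per_group model group-wise quantization
    metadata (assumed: one 16-bit scale + one 16-bit bias per group).
    Pass group_size=None for an unquantized checkpoint.
    """
    effective_bits = bits_per_weight
    if group_size:
        # Spread the per-group metadata cost across the weights in the group.
        effective_bits += overhead_bits_per_group / group_size
    return num_params * effective_bits / 8 / 2**30


if __name__ == "__main__":
    # Hypothetical parameter count, chosen only for illustration.
    hypothetical_params = 12e9

    fp16 = weight_memory_gib(hypothetical_params, 16)
    q8 = weight_memory_gib(hypothetical_params, 8, group_size=64)

    print(f"16-bit weights: {fp16:.1f} GiB")
    print(f" 8-bit weights: {q8:.1f} GiB")
    print(f"ratio: {q8 / fp16:.2f}")
```

Under these assumptions the quantized weights take slightly more than half the space of the 16-bit weights, because the group metadata adds a small overhead on top of the 8 bits per weight.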