A compact multimodal model from vectionlabs that accepts both text and image inputs. At 9B parameters, it falls in the mid-size range, balancing capability against resource requirements. Beyond its multimodal input support, little information is available about its specific strengths, fine-tuning focus, or intended use cases.