A compact multimodal model that punches above its weight, handling both text and image inputs with the efficiency expected of a quantized 4B-parameter architecture. FP8 precision keeps the memory footprint lean while preserving reasonable capability, though aggressive quantization of an already small model means complex reasoning tasks may show more strain than on larger counterparts. It ships under an open, permissive license, making it straightforward to deploy and modify.
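To make the memory-footprint claim concrete, here is a rough back-of-envelope estimate (an illustrative sketch, assuming roughly 4 billion parameters and counting weights only, ignoring activations, KV cache, and runtime overhead):

```python
# Rough weight-memory estimate for a ~4B-parameter model at different precisions.
# Weights only -- activations, KV cache, and framework overhead are ignored.

PARAMS = 4e9  # approximate parameter count

BYTES_PER_PARAM = {
    "fp32": 4,       # full precision
    "fp16/bf16": 2,  # half precision
    "fp8": 1,        # one byte per weight
}

for precision, nbytes in BYTES_PER_PARAM.items():
    gib = PARAMS * nbytes / 2**30
    print(f"{precision:>9}: ~{gib:.1f} GiB of weights")
```

At one byte per parameter, FP8 weights come in around 3.7 GiB, roughly half of FP16 and a quarter of FP32, which is what allows a model of this size to fit comfortably on consumer GPUs.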