A compact multimodal base model that punches above its weight class, handling both text and image inputs with reasonable fluency. As a base model, it lacks instruction-following polish: it continues text rather than answering questions, making it a raw ingredient for fine-tuning rather than a ready-to-use assistant. Its Apache 2.0 license makes it freely adaptable for commercial projects.
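Because a base model continues text instead of answering, one common workaround before any fine-tuning is to phrase a request as a document whose natural continuation is the answer. The sketch below is only an illustration of that idea: the helper name `build_completion_prompt`, the `Q:`/`A:` layout, and the sample questions are all assumptions made for this example, not anything specified for this model, and the resulting string would still need to be passed to whatever inference stack you use.

```python
def build_completion_prompt(examples, question):
    """Frame a question as text to be continued, using worked Q/A pairs.

    Hypothetical helper: the "Q:"/"A:" layout is an assumed convention,
    not a format required by any particular model.
    """
    # Each worked example shows the pattern the model should continue.
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    # Leave the final answer slot open so the continuation fills it in.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Illustrative data only.
examples = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]
prompt = build_completion_prompt(examples, "What is the capital of Italy?")
print(prompt)
```

The prompt ends on an open `A:` line, so a text-continuation model is nudged to produce an answer in the same style as the examples rather than, say, inventing further questions.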