Built on Qwen3's 27B architecture, this variant accepts both text and image inputs, making it multimodal. It has been processed with abliteration, a technique that removes refusal behaviors by ablating the refusal direction in the model's activation space, and quantized to NVFP4 precision for efficient inference. Because it originates from community fine-tuning, safety guardrails are intentionally absent and output consistency is not guaranteed.
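To give a rough sense of what NVFP4 quantization does to the weights, here is a minimal sketch, assuming the commonly described scheme of 4-bit E2M1 elements (representable magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6) with one shared scale per small block; the block size, scale format, and rounding used by the actual model may differ.

```python
# Sketch of block-wise NVFP4-style quantization (assumptions: E2M1
# element grid, one scale per block chosen so the largest magnitude
# maps to the grid maximum of 6; real kernels use FP8 scales and
# round-to-nearest-even, details omitted here).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(values):
    """Snap each float in a block to the nearest scaled E2M1 point."""
    amax = max(abs(v) for v in values)
    scale = amax / 6.0 if amax else 1.0  # avoid divide-by-zero on all-zero blocks
    out = []
    for v in values:
        # pick the closest representable magnitude, then restore sign and scale
        mag = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append(mag * scale * (1.0 if v >= 0 else -1.0))
    return out

# e.g. quantize_block([0.0, 3.0, -6.0, 1.4]) snaps 1.4 up to 1.5
```

The coarse 8-magnitude grid is why per-block scaling matters: each block of weights is rescaled so its own largest value uses the full grid range, which keeps relative error manageable despite only 4 bits per element.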