This compact multimodal model from the Qwen 3 family is published under trl-internal-testing, which suggests it is intended for internal experimentation and library integration work rather than production deployment. The 'NoThink' designation implies that chain-of-thought reasoning is disabled or bypassed, favoring faster, direct responses over deliberative output. Its small footprint makes it accessible for local testing, though its experimental provenance means stability and documentation may be limited.