A quantized multimodal model that accepts both text and image inputs, running at 4.75-bit mixed precision (NVFP4/BF16) and optimized for vLLM inference on NVIDIA Blackwell hardware. It has 35B total parameters in an A3B configuration (a naming convention that typically indicates about 3B parameters active per token), which balances memory efficiency with scale. No information about its reasoning or task-specific strengths is available beyond this technical configuration.
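
To make the memory-efficiency claim concrete, here is a rough back-of-the-envelope sketch. It assumes that "4.75-bit" means an average of 4.75 bits per weight across all 35B parameters, which the description does not state explicitly. It counts weight storage only and ignores the KV cache, activations, and runtime overhead. The function name and constants are illustrative and not part of any library.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Figures taken from the description; treating 4.75 bits as a
# per-weight average over all parameters is an assumption.
TOTAL_PARAMS = 35e9
AVG_BITS_QUANTIZED = 4.75
BITS_BF16 = 16  # unquantized BF16 baseline for comparison

quantized_gb = weight_footprint_gb(TOTAL_PARAMS, AVG_BITS_QUANTIZED)
bf16_gb = weight_footprint_gb(TOTAL_PARAMS, BITS_BF16)

print(f"Quantized weights: {quantized_gb:.2f} GB")
print(f"BF16 weights:      {bf16_gb:.2f} GB")
print(f"Reduction factor:  {bf16_gb / quantized_gb:.2f}x")
```

Under these assumptions the weights take roughly 20.8 GB instead of 70 GB in BF16, a reduction of about 3.4x. Actual GPU memory use during serving will be higher once the KV cache and activations are included.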