A mid-sized mixture-of-experts model: of its 35 billion total parameters, only about 3 billion are activated per forward pass, keeping inference cost low while retaining broad capacity. It accepts both text and image inputs, so it can handle visual question answering and document understanding alongside standard text tasks. The NVFP4 quantization makes it run efficiently on NVIDIA hardware with native FP4 support, though that format limits deployment flexibility on other stacks.
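The "3B active of 35B total" behavior comes from sparse expert routing: a learned router picks a small top-k subset of expert MLPs for each token, so per-token compute scales with the active experts rather than total capacity. A minimal sketch of that mechanism, with hypothetical expert counts and sizes (not this model's actual architecture):

```python
import numpy as np

# Illustrative top-k MoE routing; n_experts, sizes, and top_k are
# hypothetical, not the real model's configuration.
rng = np.random.default_rng(0)
n_experts, d_model, d_ff, top_k = 8, 16, 64, 2

# Each expert is a small two-layer ReLU MLP.
experts = [
    (rng.standard_normal((d_model, d_ff)) * 0.02,
     rng.standard_normal((d_ff, d_model)) * 0.02)
    for _ in range(n_experts)
]
router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x):
    """Route one token vector to its top_k experts and mix their outputs."""
    logits = x @ router
    chosen = np.argsort(logits)[-top_k:]          # indices of top-k experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()                      # softmax over chosen only
    out = np.zeros_like(x)
    for w, i in zip(weights, chosen):
        w1, w2 = experts[i]
        out += w * (np.maximum(x @ w1, 0) @ w2)   # run only chosen experts
    return out, chosen

token = rng.standard_normal(d_model)
y, used = moe_forward(token)

total_params = n_experts * (d_model * d_ff * 2)
active_params = top_k * (d_model * d_ff * 2)
print(f"active/total expert params: {active_params}/{total_params}")
```

Only the routed experts execute per token, which is why a 35B-parameter model can cost roughly as much per token as a dense 3B model while still drawing on the full expert pool across inputs.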