A text-in, text-out model with a notably large context window of 262,144 tokens, letting it work across very long documents or conversations without losing track of earlier content. It uses a mixture-of-experts architecture (30B total parameters, with 3B active per token) quantized to NVFP4, so it runs leaner than its full parameter count suggests. It is published by chankhavu rather than by NVIDIA directly, so provenance and support details are less clear than they would be for an official release.
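To make the "runs leaner than its full parameter count suggests" claim concrete, here is a rough back-of-envelope sketch of the weight footprint and active-parameter fraction. It assumes a nominal 4 bits per parameter for NVFP4 and ignores real-world overhead such as per-block scale factors or layers kept in higher precision, so the numbers are illustrative only.

```python
# Back-of-envelope memory math for a 30B-total / 3B-active MoE model
# quantized to NVFP4 (a 4-bit floating-point format). Approximate:
# real checkpoints carry per-block scale factors, and some layers
# (embeddings, norms) often stay in higher precision.

def weight_footprint_gb(total_params: float, bits_per_param: float) -> float:
    """Approximate in-memory size of the weights in gigabytes."""
    return total_params * bits_per_param / 8 / 1e9

total = 30e9   # all experts combined
active = 3e9   # parameters actually used for any one token

print(f"NVFP4 weights: ~{weight_footprint_gb(total, 4):.0f} GB")   # ~15 GB
print(f"FP16 weights:  ~{weight_footprint_gb(total, 16):.0f} GB")  # ~60 GB
print(f"Active fraction per token: {active / total:.0%}")          # 10%
```

The two levers are independent: quantization shrinks what you must store (roughly 4x versus FP16), while the sparse expert routing shrinks what you must compute per token (about a tenth of the parameters).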