A compact, open-weight model from NVIDIA's quantized model lineup, optimized with NV-FP4 precision to run efficiently on NVIDIA hardware. FP4 quantization trades some numerical precision for a much smaller memory footprint and faster inference, making the model practical to deploy on consumer or workstation GPUs. It handles text-in, text-out tasks and ships in the safetensors format, ready for straightforward integration.
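To make the memory trade-off concrete, here is a back-of-envelope sketch comparing weight storage at 16-bit versus ~4-bit precision. The parameter count is a hypothetical illustration (the actual model size is not stated above), and the 4.5 bits-per-weight figure is an assumption that budgets for per-block scale factors typical of FP4 schemes.

```python
# Rough memory estimate: 4-bit quantized weights vs. a 16-bit baseline.
# Parameter count below is hypothetical, not this model's actual size.

def weight_bytes(num_params: int, bits_per_weight: float) -> int:
    """Approximate bytes needed to store the weights alone
    (excludes activations, KV cache, and runtime overhead)."""
    return int(num_params * bits_per_weight / 8)

params = 8_000_000_000  # assumed 8B-parameter model for illustration

fp16 = weight_bytes(params, 16)    # FP16/BF16 baseline
fp4 = weight_bytes(params, 4.5)    # ~4 bits/weight plus block-scale overhead

print(f"16-bit footprint: {fp16 / 1e9:.1f} GB")  # 16.0 GB
print(f"FP4 footprint:    {fp4 / 1e9:.1f} GB")   # 4.5 GB
```

At these assumed sizes, the quantized weights fit comfortably in a single workstation GPU's memory where the 16-bit version would not.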