A compact, quantized text model optimized for efficiency through NVIDIA's NVFP4 format, a 4-bit floating-point representation. It trades some numerical precision for faster inference and a smaller memory footprint, making it practical to deploy on memory-constrained hardware. The open-weight release in safetensors format gives developers direct access to the weights for local use.
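
Because the weights ship as safetensors, they can be inspected locally before loading anything onto an accelerator. The following is a minimal sketch, not part of the release itself: it relies only on the published safetensors file layout (an 8-byte little-endian header length followed by a JSON header), and the function names `read_safetensors_header` and `summarize_tensors` are hypothetical helpers introduced here for illustration.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON describing each tensor.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_tensors(header):
    """Return (name, dtype, shape, size_in_bytes) for every tensor entry."""
    rows = []
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form string metadata
            continue
        start, end = info["data_offsets"]  # byte range within the data section
        rows.append((name, info["dtype"], info["shape"], end - start))
    return rows
```

Calling `summarize_tensors(read_safetensors_header(path))` on a downloaded shard (the path is whatever file you saved locally) lists each tensor's stored dtype, shape, and byte size without reading the tensor data, which is a quick way to check how much memory the quantized weights will actually occupy.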