A large mixture-of-experts text model with 122 billion total parameters, of which only 10 billion are active for any given token, so per-token compute is lower than the total parameter count suggests. It handles text-in, text-out tasks and is distributed as NVFP4-quantized safetensors, a 4-bit floating-point format that trades some numerical precision for a smaller memory footprint. Published by scottgl under Apache 2.0, it is a community-packaged variant of Qwen's third-generation architecture.
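
To put rough numbers on the figures above, here is a back-of-the-envelope sketch of weight storage and the active-parameter fraction. It assumes every parameter is quantized, and that NVFP4 stores a 4-bit value plus one 8-bit scale per block of 16 weights (about 4.5 bits per weight). In practice some layers are often kept at higher precision, so the real checkpoint is likely somewhat larger. The function and constant names are illustrative only.

```python
def weight_bytes(num_params: float, bits_per_param: float) -> float:
    """Approximate storage for weights alone (no activations or KV cache)."""
    return num_params * bits_per_param / 8


TOTAL_PARAMS = 122e9   # total parameters, from the description
ACTIVE_PARAMS = 10e9   # parameters active per token, from the description

BF16_BITS = 16.0
# Assumption: 4-bit value plus one 8-bit scale shared by 16 weights.
NVFP4_BITS = 4.0 + 8.0 / 16.0

bf16_gb = weight_bytes(TOTAL_PARAMS, BF16_BITS) / 1e9
nvfp4_gb = weight_bytes(TOTAL_PARAMS, NVFP4_BITS) / 1e9
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"BF16 weights:  ~{bf16_gb:.1f} GB")
print(f"NVFP4 weights: ~{nvfp4_gb:.1f} GB")
print(f"Active share:  ~{active_fraction:.1%} of parameters per token")
```

Note that the memory saving comes from quantization, while the compute saving comes from expert routing: all 122 billion parameters still have to be resident in memory, even though only a fraction of them is used for each token.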