Qwen3.5 122B A10B NVFP4 GB10

Qwen 3

Open WeightModel weights are publicly available — can be downloaded and self-hosted

Released April 2026context N/A122B params

A large mixture-of-experts text model with 122 billion total parameters but only 10 billion active at a time, keeping inference costs lower than its parameter count suggests. It handles text-in, text-out tasks and is distributed in NVFP4 quantized safetensors format, meaning it trades some numerical precision for reduced memory footprint. Published by scottgl under Apache 2.0, it's a community-packaged variant of Qwen's third-generation architecture.

Capabilities

Capability scores are AI-generated based on model documentation, benchmarks, and technical specifications. Learn more

Instruction Following

Strong

Long Context

Use Case Fit

Fit scores are AI-generated based on model capabilities, intended use, and technical specifications. Learn more

Qwen3.5 122B A10B NVFP4 GB10

Qwen 3

Open WeightModel weights are publicly available — can be downloaded and self-hosted

Released April 2026context N/A122B params

A large mixture-of-experts text model with 122 billion total parameters but only 10 billion active at a time, keeping inference costs lower than its parameter count suggests. It handles text-in, text-out tasks and is distributed in NVFP4 quantized safetensors format, meaning it trades some numerical precision for reduced memory footprint. Published by scottgl under Apache 2.0, it's a community-packaged variant of Qwen's third-generation architecture.

Capabilities

Capability scores are AI-generated based on model documentation, benchmarks, and technical specifications. Learn more

Instruction Following

Strong

Long Context

Use Case Fit

Fit scores are AI-generated based on model capabilities, intended use, and technical specifications. Learn more

Glossary

ArchitectureThe underlying structural design of a neural network that defines how data flows through layers and components.InferenceThe process of running a trained model to generate predictions or outputs from new inputs.Memory FootprintThe amount of RAM or storage space a model requires to run, which is critical for deployment on resource-constrained devices.Parameter CountThe total number of adjustable weights in a model; more parameters generally mean more capacity to learn, but also require more computing power.ParametersThe learned numerical values in a model — more parameters generally means more capacity but higher compute cost.PrecisionThe level of numerical detail a model uses to represent its internal values; higher precision means more accurate calculations but requires more memory.QuantizedA technique that reduces a model's size and memory usage by storing weights with lower precision (fewer bits), trading some accuracy for efficiency.SafetensorsA safe, fast file format for storing model weights, designed to prevent code execution vulnerabilities.Safetensors FormatA secure and efficient file format for storing model weights that prioritizes safety and speed when loading models.Text ModelA language model that processes and generates only text, without support for images, audio, or other media types.Text-In, Text-OutA model that accepts text as input and produces text as output, without support for images, audio, or other data types.