A mid-sized mixture-of-experts model: of its 35 billion total parameters, only about 3 billion are activated per forward pass, keeping inference cost low while retaining broad capacity. It accepts both text and image inputs, so it can handle visual question answering and document understanding alongside standard text tasks. The NVFP4 quantization makes it run efficiently on NVIDIA hardware with native FP4 support, though that format limits deployment flexibility on other stacks.
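The "3B active of 35B total" behavior comes from sparse expert routing: a learned router picks a small top-k subset of expert MLPs for each token, so per-token compute scales with the active experts rather than total capacity. A minimal sketch of that mechanism, with hypothetical expert counts and sizes (not this model's actual architecture):

```python
import numpy as np

# Illustrative top-k MoE routing; n_experts, sizes, and top_k are
# hypothetical, not the real model's configuration.
rng = np.random.default_rng(0)
n_experts, d_model, d_ff, top_k = 8, 16, 64, 2

# Each expert is a small two-layer ReLU MLP.
experts = [
    (rng.standard_normal((d_model, d_ff)) * 0.02,
     rng.standard_normal((d_ff, d_model)) * 0.02)
    for _ in range(n_experts)
]
router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x):
    """Route one token vector to its top_k experts and mix their outputs."""
    logits = x @ router
    chosen = np.argsort(logits)[-top_k:]          # indices of top-k experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()                      # softmax over chosen only
    out = np.zeros_like(x)
    for w, i in zip(weights, chosen):
        w1, w2 = experts[i]
        out += w * (np.maximum(x @ w1, 0) @ w2)   # run only chosen experts
    return out, chosen

token = rng.standard_normal(d_model)
y, used = moe_forward(token)

total_params = n_experts * (d_model * d_ff * 2)
active_params = top_k * (d_model * d_ff * 2)
print(f"active/total expert params: {active_params}/{total_params}")
```

Only the routed experts execute per token, which is why a 35B-parameter model can cost roughly as much per token as a dense 3B model while still drawing on the full expert pool across inputs.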