A mid-sized mixture-of-experts model from Alibaba's Qwen 3 family, quantized to 4-bit precision by Unsloth for efficient local inference via the MLX framework. Of its 35 billion total parameters, only around 3 billion are activated per forward pass, keeping per-token compute low while retaining broad capability across text and image inputs. The trade-off is some quality loss from the aggressive quantization, which may surface on nuanced or precision-sensitive tasks.
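A rough back-of-envelope sketch can show why this combination suits local hardware: 4-bit quantization sets the memory footprint from the *total* parameter count, while the mixture-of-experts routing sets per-token compute from the much smaller *active* count. The figures below use the parameter counts stated above and ignore the small overhead of quantization scales, KV cache, and activations, so treat them as estimates rather than exact requirements.

```python
# Back-of-envelope estimate for a 4-bit MoE model with
# ~35B total and ~3B active parameters (figures from the text above;
# quantization metadata, KV cache, and activations are ignored).
TOTAL_PARAMS = 35e9      # total parameters across all experts
ACTIVE_PARAMS = 3e9      # parameters activated per forward pass
BITS_PER_WEIGHT = 4      # 4-bit quantization

# Memory is driven by TOTAL params: every expert must be resident.
weight_bytes = TOTAL_PARAMS * BITS_PER_WEIGHT / 8
weight_gib = weight_bytes / 2**30

# Compute per token is driven by ACTIVE params only.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"Weights in memory: ~{weight_gib:.1f} GiB")
print(f"Per-token compute: ~{active_fraction:.0%} of a dense model of the same total size")
```

The asymmetry is the key point: the model needs RAM for all 35 billion weights (roughly 16 GiB at 4 bits), but each token only pays the floating-point cost of a ~3-billion-parameter dense model.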