A compact, quantized variant optimized for local inference on Apple Silicon hardware. The 4-bit quantization trades some precision for a much smaller memory footprint (roughly a quarter of a 16-bit model's weight storage), making it practical on consumer machines with limited unified memory. It handles general text tasks competently within those constraints, though the compression introduces occasional roughness in nuanced or complex reasoning.
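The memory savings can be sketched with simple arithmetic. The 7B parameter count below is a hypothetical example, and the calculation ignores the small per-group scale/zero-point metadata that real 4-bit schemes add on top of the raw weights:

```python
def weight_bytes(n_params: int, bits: int) -> int:
    """Approximate bytes needed to store model weights at a given bit width.

    Real quantization formats add minor overhead (e.g. 16-bit scales per
    group of weights), so treat this as a lower-bound estimate.
    """
    return n_params * bits // 8

GIB = 1024 ** 3
n = 7_000_000_000  # hypothetical 7B-parameter model

fp16_gib = weight_bytes(n, 16) / GIB  # ~13.0 GiB
q4_gib = weight_bytes(n, 4) / GIB     # ~3.3 GiB

print(f"fp16: {fp16_gib:.1f} GiB, 4-bit: {q4_gib:.1f} GiB")
```

The roughly 4x reduction is what brings a model of this size within reach of machines with 8 to 16 GB of unified memory.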