A 4-bit quantized version of DeepSeek's V4 Flash model, packaged by the MLX community for Apple Silicon hardware. Quantizing the weights to 4 bits substantially reduces the memory footprint, which makes the model practical to run locally on Macs, at the cost of some precision relative to the full-precision weights. It is a text-in, text-out model with a context window of over one million tokens.
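
For a rough sense of what 4-bit quantization saves, the following back-of-the-envelope sketch estimates the memory needed for weights alone. The parameter count is a made-up placeholder, not this model's actual size. The group size of 64 with a 16-bit scale and a 16-bit bias per group is an assumption about the quantization layout, and the 16-bit baseline is likewise assumed for comparison. The KV cache and activations are not counted, and with a very long context they can add a great deal on top.

```python
def effective_bits(bits: int, group_size: int = 64, overhead_bits: int = 32) -> float:
    """Bits stored per weight, including per-group metadata.

    Assumes each group of `group_size` weights carries `overhead_bits`
    of extra data (here: one 16-bit scale plus one 16-bit bias).
    """
    return bits + overhead_bits / group_size


def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Hypothetical parameter count, for illustration only.
EXAMPLE_PARAMS = 100e9

if __name__ == "__main__":
    baseline = weight_memory_gib(EXAMPLE_PARAMS, 16)                # assumed 16-bit baseline
    quantized = weight_memory_gib(EXAMPLE_PARAMS, effective_bits(4))
    print(f"16-bit weights: {baseline:.1f} GiB")
    print(f"4-bit weights:  {quantized:.1f} GiB")
    print(f"ratio:          {quantized / baseline:.3f}")
```

Under these assumptions the quantized weights take a little over a quarter of the 16-bit footprint, because the per-group metadata raises the effective cost from 4 to 4.5 bits per weight.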