A low-precision numerical format that uses 4 bits per weight, developed by NVIDIA to compress models for efficient inference on consumer hardware.