A file format designed for efficient storage and loading of large language models and embedding models, optimized for fast inference across a range of hardware.