A training technique that allows a single embedding model to produce high-quality results at multiple vector sizes, letting you shrink the embedding dimensions to save storage and speed without retraining.
Performance retention over long documents and conversations