A technique that groups similar items together using hashing, allowing the model to attend to relevant parts of long text without comparing every token to every other token.
Performance retention over long documents and conversations