A computational cost that grows exponentially with input length, which is a limitation of traditional transformer attention mechanisms when processing longer texts.
Performance retention over long documents and conversations