A neural network architecture that processes sequences by tracking hidden states over time, offering faster inference and lower memory use than traditional transformers.
Performance retention over long documents and conversations