MAA enables agents to learn which memory operations consistently help by accumulating cross-batch evidence, making agent self-improvement more efficient and reliable without requiring online training.
This paper addresses a problem in training AI agents: when the same memory operation gets conflicting feedback across different training batches, it's hard to know which operations actually work. MAA solves this by accumulating evidence for each operation across batches and filtering out unreliable ones, improving agent learning while using 75% fewer tokens during training.