Context-space learning, in which agents update their internal state and behavior through reflection, should be treated as a systematic optimization problem rather than a collection of ad hoc tricks. Applying established optimization techniques such as batching and credit assignment significantly improves results.
This paper studies how AI agents learn by updating their context (their internal state) rather than their weights, much as humans learn by reflecting on experience.
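To make the two techniques named above concrete, here is a minimal sketch of what a batched, credit-assigned context update could look like. It is an illustration under stated assumptions, not the paper's method: the names `Episode`, `Agent`, `reflect`, `batched_update`, and the `min_support` threshold are all hypothetical, and the "reflection" is a trivial rule standing in for whatever an actual agent would generate.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Episode:
    """One attempt at a task (hypothetical structure for illustration)."""
    steps: List[str]              # actions taken, in order
    failed_step: Optional[int]    # index of the step blamed for failure; None on success


@dataclass
class Agent:
    """The agent's weights are never touched; only this context changes."""
    context: List[str] = field(default_factory=list)


def reflect(episode: Episode) -> Optional[str]:
    """Credit assignment (toy version): blame one specific step, not the whole episode."""
    if episode.failed_step is None:
        return None
    return f"avoid: {episode.steps[episode.failed_step]}"


def batched_update(agent: Agent, episodes: List[Episode], min_support: int = 2) -> None:
    """Batching (toy version): pool lessons across episodes and keep only
    those that recur at least `min_support` times, instead of rewriting
    the context after every single episode."""
    lessons = (reflect(ep) for ep in episodes)
    counts = Counter(lesson for lesson in lessons if lesson is not None)
    for lesson, n in counts.items():
        if n >= min_support and lesson not in agent.context:
            agent.context.append(lesson)


agent = Agent()
episodes = [
    Episode(["search", "guess"], failed_step=1),
    Episode(["search", "guess"], failed_step=1),
    Episode(["search", "verify"], failed_step=None),
    Episode(["skip", "verify"], failed_step=0),
]
batched_update(agent, episodes)
print(agent.context)
```

In this sketch, a lesson seen only once (blaming `skip`) is not written to the context, while the one that recurs across the batch is. The design choice it illustrates is that aggregating over a batch filters out one-off noise before the context is modified.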