A training technique that backpropagates gradients through only the most recent steps of a sequence instead of its full history, reducing memory use and computation at the cost of ignoring longer-range dependencies.
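
As a rough illustration, one common instance of this idea is truncated backpropagation through time for recurrent models. The sketch below is a toy, not a reference implementation: it assumes a hypothetical scalar recurrence `h_t = w * h_{t-1} + x_t` with a squared-error loss on the final state. The names `forward`, `grad_w`, and `window` are invented for this example and do not come from any library.

```python
def forward(w, xs, h0=0.0):
    """Run the scalar recurrence h_t = w * h_{t-1} + x_t and return all states."""
    hs = [h0]
    for x in xs:
        hs.append(w * hs[-1] + x)
    return hs  # hs[t] is the state after consuming t inputs


def grad_w(w, xs, target, window=None):
    """Gradient of 0.5 * (h_T - target)**2 with respect to w.

    window=None backpropagates through the full history;
    window=k stops after the k most recent steps (assumed truncation rule).
    """
    hs = forward(w, xs)
    T = len(xs)
    steps = T if window is None else min(window, T)
    dh = hs[T] - target  # dL/dh_T
    grad = 0.0
    for t in range(T, T - steps, -1):
        grad += dh * hs[t - 1]  # direct contribution of step t, since dh_t/dw = h_{t-1}
        dh *= w                 # carry the gradient back one step: dL/dh_{t-1}
    return grad


full = grad_w(0.5, [1.0, 1.0, 1.0], target=0.0)             # all 3 steps
recent = grad_w(0.5, [1.0, 1.0, 1.0], target=0.0, window=1)  # last step only
```

With these inputs the full gradient is 3.5 and the one-step truncation gives 2.625, so the truncated value is a cheaper but biased estimate. This toy still stores every state in `forward`; the memory saving in practice comes from keeping only the states inside the window.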